L. A. DE ROSA


L. A. de Rosa was born in Tenafly, N. J., on September 11, 1910. He received the BS 
degree in electrical engineering from the Polytechnic Institute of Brooklyn, where he 
continued with graduate work until 1934. 

Mr. de Rosa’s professional work began in 1931, as an engineer with the DeForest 
Tube Company, Newark, N. J. In 1932, he became identified with Electrad, Inc., New 
York City, as a research engineer on radio components, receivers, and amplifiers. From 
1934 to 1937, Mr. de Rosa was chief engineer of Electrotechnical Laboratories, New York 
City, where he engaged in work on electronics, remote control, and recording. In 1937, 
he became chief engineer with the Electrad Division of P. R. Mallory Co., Indianapolis, 
Ind., where he worked for a year on photoelastic research and special devices. The follow- 
ing year, while at Indianapolis, Mr. de Rosa devoted individual research to physio- 
psychological acoustics. From 1939 until 1942, he occupied a position as staff engineer 
with the Electronics Research Laboratories of the National Cash Register Co., Dayton, 
Ohio. Here his work was concerned with magnetic-materials research, electronic com- 
puters, and pulse communication. 

From 1942 until the present time, Mr. de Rosa has been with the Federal Telecom- 
munication Laboratories, Nutley, N. J. First as a senior project engineer, and later as a 
department head, he was engaged in research work on aerial navigation, direction finders, 
and automatic-landing equipment. From 1945 until 1953, he served as a division head in 
charge of acoustical research and electronic countermeasures work. Since 1953, Mr. 
de Rosa has been head of the Electronic Countermeasures Laboratory, which now com- 
prises a group of departments concerned with research and development in the fields of 
active and passive electronic countermeasures systems, antenna research, data processing, 
and communication theory. 

Mr. de Rosa has presented various technical papers at conventions and meetings of 
the Institute of Radio Engineers, the Radio Club of America, and the Acoustical Society 
of America. He holds about forty patents issued in the fields of direction finding, radar, 
antennas, communications, and components. 

A Fellow of the Institute of Radio Engineers and chairman of its Professional Group 
on Information Theory, Mr. de Rosa also holds membership in the American Physical 
Society and the Acoustical Society of America. 
In Which Fields Do We Graze? 


L. A. DE ROSA 


Chairman, Professional Group on Information Theory 




THE EXPANSION of the applications of Information Theory to fields other than
radio and wired communications has been so rapid that oftentimes the bounds 
within which the Professional Group interests lie are questioned. Should an at- 
tempt be made to extend our interests to such fields as management, biology, psychology, 
and linguistic theory, or should the concentration be strictly in the direction of com- 
munication by radio or wire? 

To make one’s interest the formulation and extension of the general theory of infor- 
mation, and then, having armed oneself with such a universal and powerful tool, to 
consider only those applications which deal with radio and wire communication, is an 
attitude which has been criticized by a number of our members. 

Other Professional Groups whose interests lie in more sharply defined fields must, 
perforce, consider the application of Information Theory to their respective fields; other- 
wise, the benefits which may accrue through the extension of Information Theory to these 
various specialized fields might occur belatedly, or not at all. 

Some of our members argue that should the application of Information Theory to 
other specialized fields be left to their specialists and the interests of PGIT not extend 
to fields other than radio and wire communication, then PGIT would be a purely aca- 
demic and theoretical group with no interests in any but the general, universally applic- 
able, mathematical procedures. 

We have heard the opposite views expressed also, namely that PGIT should encour- 
age the extension of the theorems to other general fields and broaden the scope of PGIT 
to include the interests of Psychology, Biology, and other branches of the “Arts and
Sciences.” In so doing, it is argued, PGIT becomes a creative group in advancing the
theory of information and in assisting other Professional Groups. Thus, by disseminating
information of other fields which may be required for the over-all solution of the problem
of communication from one subjective sensory terminal to another (the over-all “brain-
to-brain” terminals), a raison d’être is established for us.

At least one more group feels that PGIT should confine itself to adapting the generic 
developments of Information Theory to the specific field of radio, electronics, and wire 
communication, foregoing all ties with computers, television, telemetry, management, 
automation, or circuit theory. 

It would be interesting to obtain the views of PGIT members with regard to the 
proper bounds of our interests and activities, for without such expression, proper direction 
cannot be achieved. 
Theory of Noise in a Correlation Detector*


M. HOROWITZ AND A. A. JOHNSON†


Summary—The problem of detecting a signal that has both
magnitude and sign is considered. A new type of correlation device, 
which employs the derivative of the correlation function, is proposed 
for measuring target range and position, and an analysis to de- 
termine the most useful waveform is made. The effects of white 
noise accompanying the input signal are minimized when an input 
waveform of long duration, wide bandwidth, and high derivative 
power is chosen. 


INTRODUCTION 


IT HAS BEEN recognized for some time that correlation detectors may be used to detect
signals contaminated by noise. Fano¹ has studied the particular case of a correlation
detector consisting of a multiplier followed by a low-pass filter. Goldman² has
summarized the work of Lee³ in using cross-correlation for detecting repetitive signals.
Woodward,⁴ in his simple theory of radar reception, employs the correlation function.
It is the purpose of this paper to extend the above-mentioned earlier work in order to
deal with the problem of detecting a vector signal; i.e., one that has both magnitude
and direction. In particular, whereas Woodward has considered the problem of measuring
the scalar range to a target by analysis of transmitted and received waveforms, we shall
be concerned not only with measuring the range to the target but also with finding out
whether the target is in front or to the rear of the receiver’s position.


A SPECIAL TYPE OF CORRELATION DETECTOR


We consider a real waveform $f(t)$ where $f$ is amplitude and $t$ is time. The function
$f(t)$ is assumed to be a stationary time series; in this case, as Wiener⁵ has remarked,
$f(t)$ possesses an autocorrelation function

$$\phi(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, f(t + \tau)\, dt.$$

Since $f(t)$ is assumed real, $\phi(\tau)$ is a real symmetric function of $\tau$, as is
noted by Bartlett.⁶ The first derivative of $\phi(\tau)$ with respect to $\tau$,
$d\phi/d\tau$, will hence be an odd function; this is shown, for example, by Indjoudjian.⁷


* Received by PGIT October 10, 1955.

† Aerophysics Depts., Goodyear Aircraft Corp., Akron, Ohio.

¹ R. M. Fano, “Signal-to-noise ratio in correlation detectors,” M.I.T. Res. Lab. Elec.
Tech. Rep., No. 186; February, 1951.

² S. Goldman, “Information Theory,” Prentice-Hall, Inc., New York, N. Y., p. 279; 1953.

³ Y. W. Lee, “Application of statistical methods to communication problems,” M.I.T. Res.
Lab. Elec. Tech. Rep., No. 181; 1950.

⁴ P. M. Woodward, “Probability and Information Theory, with Applications to Radar,”
McGraw-Hill Book Co., Inc., New York, N. Y., p. 82; 1953.

⁵ N. Wiener, “Extrapolation, Interpolation, and Smoothing of Stationary Time Series,”
John Wiley and Sons, Inc., New York, N. Y., p. 18; 1949.

⁶ M. S. Bartlett, “An Introduction to Stochastic Processes,” Cambridge University Press,
Cambridge, England, p. 160; 1955.

⁷ L. de Broglie et al., “La Cybernetique,” Editions de la Revue d’Optique, Paris, France,
p. 49; 1951.


We now define a special type of correlation detector that computes a time shift $\tau$
by means of the function

$$\alpha(\tau) = \int_0^T f'(t)\, f(t + \tau)\, dt, \qquad (1)$$

where $T$ is the averaging time, $\tau$ is a time shift whose magnitude and sign are to be
determined, and prime denotes differentiation with respect to $t$.

It is assumed that $f(t)$ possesses finite derivatives so that $f(t + \tau)$ may be
expanded in a Taylor series; it is assumed the series converges. In this case, we find,
from (1), that

$$\alpha(\tau) = \tfrac{1}{2}\left[f^2(t)\right]_0^T + \tau \int_0^T [f'(t)]^2\, dt + \cdots.$$

For sufficiently large $T$ and small $\tau$ we have approximately

$$\alpha(\tau) \approx \tau \int_0^T [f'(t)]^2\, dt. \qquad (2)$$

$\alpha(\tau)$ is thus essentially an odd function for small $\tau$ and large $T$ and has
the form shown in Fig. 1. The odd character of $\alpha(\tau)$ follows also from the fact
that $d\phi/d\tau$ is odd and that $\alpha(\tau)$ will not differ much from $d\phi/d\tau$
when $T$ is large and $\tau$ is small.

Fig. 1. Form of output function (detector output versus time shift).

Then by (1) and (2), for small $\tau$ and sufficiently large $T$,

$$\tau \approx \frac{1}{T P'} \int_0^T f'(t)\, f(t + \tau)\, dt, \qquad (3)$$

where $P'$ is the average power in $f'(t)$; that is to say,

$$P' = \frac{1}{T} \int_0^T [f'(t)]^2\, dt.$$

We therefore define the output signal in the noise-free case as

$$s(\tau) = \frac{1}{T P'} \int_0^T f'(t)\, f(t + \tau)\, dt. \qquad (4)$$
When the waveform $f(t)$ is accompanied by input noise $n(t)$, the output of the detector
will be $s(\tau)$ plus output noise. The object of our analysis is to find a suitable
waveform $f(t)$ so that the signal $s(\tau)$ can be most readily extracted from the
output of the detector.
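
The behaviour described by (2) to (4) can be checked numerically. The sketch below is only an illustration under stated assumptions: the three-sinusoid test waveform, the averaging time of 50 s, and the trapezoidal integration are hypothetical choices, not taken from the paper. It evaluates $s(\tau)$ from (4) and shows that, for small $\tau$, the output approximates $\tau$ in both magnitude and sign.

```python
import math

# Hypothetical test waveform: (amplitude, frequency in Hz, phase in rad).
COMPONENTS = [(1.0, 3.0, 0.3), (1.0, 5.0, 1.1), (1.0, 7.3, 2.0)]


def f(t):
    """Band-limited test waveform f(t)."""
    return sum(a * math.sin(2 * math.pi * hz * t + ph) for a, hz, ph in COMPONENTS)


def f_prime(t):
    """Analytic derivative f'(t) of the test waveform."""
    return sum(
        a * 2 * math.pi * hz * math.cos(2 * math.pi * hz * t + ph)
        for a, hz, ph in COMPONENTS
    )


def integrate(func, T, steps):
    """Trapezoidal approximation of the integral of func over [0, T]."""
    dt = T / steps
    total = 0.5 * (func(0.0) + func(T))
    for i in range(1, steps):
        total += func(i * dt)
    return total * dt


def detector_output(tau, T=50.0, steps=50_000):
    """s(tau) = (1 / (T P')) * integral_0^T f'(t) f(t + tau) dt, as in (4)."""
    p_prime = integrate(lambda t: f_prime(t) ** 2, T, steps) / T  # average power of f'
    corr = integrate(lambda t: f_prime(t) * f(t + tau), T, steps)
    return corr / (T * p_prime)


if __name__ == "__main__":
    for tau in (-0.005, 0.0, 0.005):
        print(f"tau = {tau:+.4f}  s(tau) = {detector_output(tau):+.6f}")
```

For these assumed parameters the outputs for $\tau = \pm 0.005$ come out close to $\pm 0.005$; the residual reflects the neglected Taylor terms and the finite averaging time.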


EVALUATION OF OUTPUT SIGNAL-TO-NOISE POWER

Let us now assume that a noise $n(t)$ accompanies the function $f(t + \tau)$ in (4) so
that the input to the correlation detector is

$$g(t + \tau) = f(t + \tau) + n(t)$$

instead of $f(t + \tau)$. Substituting the above expression for $g(t + \tau)$ in the place
of the $f(t + \tau)$ in the right-hand side of (4), we find that the output of the
correlation detector takes the form

$$\frac{1}{T P'} \int_0^T f'(t)\, f(t + \tau)\, dt + \frac{1}{T P'} \int_0^T f'(t)\, n(t)\, dt. \qquad (5)$$

Hence, the output noise is given by

$$N_0 = \frac{1}{T P'} \int_0^T f'(t)\, n(t)\, dt. \qquad (6)$$


We now seek a suitable representation of $f'(t)$ and $n(t)$. We have chosen $f'(t)$ and
$n(t)$ as functions with band-limited spectra to take advantage of the properties of the
Whittaker cardinal functions.⁸⁻¹⁰ Furthermore, we define

$$f'(t) = \begin{cases} f'(t), & 0 \le t \le T, \\ 0 & \text{elsewhere.} \end{cases}$$

Then

$$f'(t) = \sum_{k=0}^{2TW} f'\!\left(\frac{k}{2W}\right) u_k(t), \qquad (7)$$

where $f'(t)$ is limited to the band 0 to $W$ cps and

$$u_k(t) = \frac{\sin \pi(2Wt - k)}{\pi(2Wt - k)}.$$

We now consider white noise as described by Shannon,¹¹

$$n(t) = \sum_{k=-\infty}^{\infty} n\!\left(\frac{k}{2W_n}\right) \frac{\sin \pi(2W_n t - k)}{\pi(2W_n t - k)},$$

with the $n(k/2W_n)$ normal and independent, all with the same standard deviation
$\sqrt{N}$. This equation is a representation of white noise, band-limited to the band 0 to
$W_n$ cycles per second and with average power $N$. We now assume $W_n = W$. This
assumption is possible since in either case we may choose the greater limit.

⁸ E. T. Whittaker, “On the functions which are represented by the expansions of the
interpolation theory,” Univ. of Edinburgh Math. Dept. Res., Paper No. 8; 1915.

⁹ J. M. Whittaker, “Interpolatory function theory,” Cambridge Tracts in Math. and Math.
Phys., Tract No. 33, Cambridge University Press, Cambridge, England; 1935.

¹⁰ D. Gabor, “Summary of Communication Theory,” from “Communication Theory: Papers Read at
Symposium on Applications of Communication Theory, London, September 22-26, 1952” (Willis
Jackson, ed.), Academic Press, New York, N. Y., p. 5; 1953.

¹¹ C. E. Shannon, “A mathematical theory of communication,” Bell Sys. Tech. Jour., vol. 27,
p. 379; July, 1948; p. 623; October, 1948.

Then

$$n(t) = \sum_{k=-\infty}^{\infty} n\!\left(\frac{k}{2W}\right) u_k(t). \qquad (8)$$

Hence, since $f'(t)$ vanishes outside $0 \le t \le T$,

$$N_0 = \frac{1}{T P'} \int_{-\infty}^{\infty} \left[\sum_{j=0}^{2TW} f'\!\left(\frac{j}{2W}\right) u_j(t)\right] \left[\sum_{k=-\infty}^{\infty} n\!\left(\frac{k}{2W}\right) u_k(t)\right] dt. \qquad (9)$$


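By the orthonormality relation,

$$\int_{-\infty}^{\infty} u_j(t)\, u_k(t)\, dt = \frac{\delta_{jk}}{2W},$$

we have

$$N_0 = \frac{1}{2TWP'} \sum_{k=0}^{2TW} f'\!\left(\frac{k}{2W}\right) n\!\left(\frac{k}{2W}\right). \qquad (10)$$

We wish to know the ensemble average $\langle N_0^2 \rangle_{\mathrm{av}}$ where $f(t)$ is
kept fixed and $n(t)$ varies over the ensemble of noise functions. We have

$$\langle N_0^2 \rangle_{\mathrm{av}} = \left(\frac{1}{2TWP'}\right)^2 \left\langle \left[\sum_{j} f'\!\left(\frac{j}{2W}\right) n\!\left(\frac{j}{2W}\right)\right] \left[\sum_{k} f'\!\left(\frac{k}{2W}\right) n\!\left(\frac{k}{2W}\right)\right] \right\rangle_{\mathrm{av}}. \qquad (11)$$

Because the noise is white and Gaussian,

$$\left\langle n\!\left(\frac{j}{2W}\right) n\!\left(\frac{k}{2W}\right) \right\rangle_{\mathrm{av}} = N \delta_{jk}.$$

The integral defined above,

$$P' = \frac{1}{T} \int_0^T [f'(t)]^2\, dt,$$

may be approximated by the series

$$\frac{1}{2TW} \sum_{k=0}^{2TW} \left[f'\!\left(\frac{k}{2W}\right)\right]^2,$$

assuming $2TW$ is large. Hence, approximately,

$$\langle N_0^2 \rangle_{\mathrm{av}} = \frac{N}{2TWP'}. \qquad (12)$$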
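
A quick Monte Carlo check of (12) is sketched below. It works with the sampled form (10) rather than with the integrals, draws the noise samples as independent Gaussian values of variance $N$, and compares the simulated mean-square output noise with $N/(2TWP')$. The derivative samples, the seed, and the helper name are illustrative assumptions, and the number of samples simply stands in for $2TW$.

```python
import math
import random


def output_noise_mean_square(fp_samples, noise_power, trials, rng):
    """Return (simulated, predicted) mean-square output noise.

    fp_samples  -- samples f'(k/2W); their count K stands in for 2TW
    noise_power -- N, the variance of each independent noise sample
    """
    k_count = len(fp_samples)
    p_prime = sum(v * v for v in fp_samples) / k_count  # series approximation of P'
    sigma = math.sqrt(noise_power)
    acc = 0.0
    for _ in range(trials):
        # Sampled form (10): N_0 = (1 / (2TW P')) * sum_k f'_k n_k
        n0 = sum(v * rng.gauss(0.0, sigma) for v in fp_samples) / (k_count * p_prime)
        acc += n0 * n0
    predicted = noise_power / (k_count * p_prime)  # eq. (12)
    return acc / trials, predicted


if __name__ == "__main__":
    rng = random.Random(1955)  # fixed seed, chosen arbitrarily
    samples = [2.0 * math.cos(0.37 * k) for k in range(100)]  # hypothetical f' samples
    simulated, predicted = output_noise_mean_square(samples, 0.5, 4000, rng)
    print(f"simulated {simulated:.6f}  predicted {predicted:.6f}")
```

The two printed values should agree to within the sampling scatter of the simulation, which shrinks as the number of trials grows.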


CONCLUSIONS

When $s(\tau) + N_0$ is small, we may interpret this quantity as $\tau + \Delta\tau$ where
$\Delta\tau$ is the error in the measurement of $\tau$ due to the noise. Then in the case
of white Gaussian noise

$$\left(\langle (\Delta\tau)^2 \rangle_{\mathrm{av}}\right)^{1/2} = \left[\frac{N}{2TWP'}\right]^{1/2}. \qquad (13)$$

If the perfect integrator is replaced by a low-pass filter, then $T$ may, as an
approximation, be identified with twice the time constant of the filter. In that case

$$\left(\langle (\Delta\tau)^2 \rangle_{\mathrm{av}}\right)^{1/2} = C \left[\frac{(\text{noise power per unit signal bandwidth}) \times (\text{bandwidth of the low-pass filter})}{\text{average-signal derivative power}}\right]^{1/2}. \qquad (14)$$
It will be noted that $N/W$, or noise per unit bandwidth, is constant with $W$ for white
noise whereas the average-signal derivative power increases with $W$. Hence, a wide
signal bandwidth is beneficial provided $s(\tau) + N_0$ is small.

By the radar relation $r = \tfrac{1}{2} c \tau$, where $r$ is the range of the target, one
may use the proposed correlation detector for small signal and noise to determine the
range and direction of a target. In such a detector it follows from the above analysis
that in the presence of white Gaussian noise a narrow detector bandwidth and a signal
with a wide bandwidth would be beneficial. Although filtering alters the spectrum of the
noise, it can be shown that a linear filter would not be beneficial. The proof of this
fact lies beyond the scope of this paper.

These results have been restricted to the class of real 
functions f(t), which are stationary time series having 
finite amplitudes, and whose derivatives exist and are 
also finite. It is also assumed that f(t) is examined by 
the correlation detector for a period $T$ of long duration.
The Taylor series expansion of $f(t + \tau)$ is assumed to
converge. 


Optimum Sequential Detection of Signals in Noise*


J. J. BUSSGANG† AND D. MIDDLETON‡


Summary—A device which performs a sequential test on a mixture
of signal and noise is called a Sequential Detector. With such
a device, two thresholds are introduced, each of which is associated 
with a terminal decision. The length of the detection process (inte- 
gration time) is not fixed in advance of the experiment but is a 
random variable, depending on the progress of the test. An optimum 
form of such a test exists and is characterized by the fact that detection
is performed on the average faster than with conventional,
i.e., fixed sample size (optimum or non-optimum), devices. The
sequential analysis developed by A. Wald is fully applied in this 
paper, but an important new feature is the treatment of correlated 
samples and its application to continuous sampling processes. 
In the introduction, the problem is presented within the framework
of Wald’s Statistical Decision Theory, and the optimum proper-
ties of sequential detectors are discussed accordingly. It is pointed 
out that a sequential detector is defined in terms of conditional 
probabilities and hence its operation is essentially independent of 
a priori information, although the average risk or cost of detection 
necessarily depends on the a priori signal data. The general theory 
is illustrated with some cases of special interest. 
The simplest example of detection involves independent, discrete
observations; e.g., the case of a pulsed carrier in normal noise. Here
the optimum detector still has the well-known log $I_0$ structure, but it
is shown that the square law approximation for weak signals requires 
a bias correction due to the fourth order term. Coherent sequential 
detection of causal signals in normal noise provides another illus- 
tration of the theory. An interesting result is that the probabilities 
of error do not depend on the shape of the filter, provided the proper 
computer is used. The use of RC-filtered noise illustrates the treat- 
ment of continuous detection processes. Finally, the reduction in 
minimum detectable signal level resulting from the use of a sequen- 
tial detector is computed. A third example is the sequential detection 
of random signals in normal noise. It is shown that, although the 
optimum computer involves the knowledge of the inverted correla- 
tion matrix, the average length of the test does not. Hence a curious 
result is obtained that in this instance detection can be performed in
an arbitrarily short time. The paper concludes with a discussion of 
the practical necessity of truncating the detection process and exact 
expressions for the error probabilities of such truncated tests are 
derived and compared with Wald’s original approximations. 


* This work was supported in part at Cruft Lab., Harvard University, under a contract
with O.N.R., the Signal Corps and the U.S. Air Force. A part of this paper was presented
at 1955 Wescon.

† RCA Aviation Systems Lab., Waltham, Mass.

‡ 49 Lexington Avenue, Cambridge, Mass.


I. AN OUTLINE OF THE GENERAL THEORY OF THE BINARY
SEQUENTIAL DETECTION PROCESS


Introduction 


IT HAS BEEN recognized in recent years that the
problem of detection of signals in noise, in its fullest
sense, should be viewed as a test of statistical
hypotheses [1-8]. If it is important to speed up the
detection process, one is led to consider sequential tests, 
because of their optimum nature. The procedure for such 
tests is to introduce two thresholds at the detector output 
such that the signal is declared present if one, and absent 
if the other of them is exceeded. The length of the de- 
tection process (integration time) is not fixed in advance, 
but is a random variable depending on the progress of 
the test. Application of sequential tests to detection 
problems is here termed Sequential Detection. The 
device which performs the test is called a Sequential
Detector. The major feature of these devices is that they 
minimize the average detection time. 

In the last ten years, many contributions have been 
made to the statistical theory of sequential tests [9-14] 
and the associated analysis has been extensively de- 
veloped. More recently, sequential procedures have been 
applied to communication problems [7, 15-22]. 

To place sequential detection in proper perspective, 
we begin by locating it within the general domain of 
Statistical Decision Theory, as applied to reception, 
following Middleton and Van Meter [23, 24]. Our attention
is then directed to simple, binary (i.e., two-decision)
detection and an outline of its theory. The mathematical 
foundations on which it is based are due principally to A. 
Wald. An application to detection is presented in Part I, 
which describes the features of the theory of interest to
us here, while Part II illustrates these rather general
remarks with important examples encountered in practice. 
One new feature of the treatment is the handling of 
correlated samples and continuous sampling processes. 


Statistical Decision Problem

The Statistical Decision Problem has been formulated only recently [25-28] in a form
broad enough to include: 1) the state of a priori knowledge, 2) losses associated with a
false outcome, and 3) costs of experimentation. The problem is considered in relation to
a sequence of random variables, say, $x_1, x_2, \ldots$, denoted by $\{x_i\}$, which may
represent, e.g., successive voltage measurements at the detector output. In general, the
joint probability distribution of $\{x_i\}$ is not known exactly. Typically, the form of
the distribution function will be known except for a set of parameters $\{\theta\}$. These
parameters may be, e.g., the dc level, or the rms value of the voltage examined. Thus
while the different $x_i$ may be known to be independent of each other and the
distribution gaussian, the associated mean and variance may be unknown. The problem of a
statistical decision arises when we are to select one of the mutually exclusive actions,
$A_1, A_2, \ldots, A_k$, the degree of preference for which depends on the true value of
the set of parameters $\{\theta\}$. These actions may be the initiation of tracking,
resumption of search, etc.

As a first step, one may begin by considering a parameter space $\Pi$ which has the
dimensionality of the number of unknown parameters (in the example above, two: mean and
variance). To each possible set of values of the unknown parameters $\{\theta\}$ there
corresponds a point in the space $\Pi$. Before the experiment begins, the parameter space
is divided into zones $\pi_1, \pi_2, \ldots, \pi_k$ such that if the point defined by
$\{\theta\}$ were in $\pi_i$, the action $A_i$ would be preferred. These zones cannot be
overlapping, because the possible actions are mutually exclusive.

The second step, after this subdivision of the parameter space $\Pi$, is to determine
from the experiment in which zone of the parameter space $\Pi$ the value $\{\theta\}$
falls. Let $H_i$ denote the hypothesis that $\{\theta\}$ lies in $\pi_i$. In order to test
which hypothesis is correct, one considers the sample space $\Sigma$ defined by the
totality of possible samples. The sample space has the dimensionality of the number of
observations in the sample. To each set of possible values of $\{x_i\}$ there corresponds
a point in the sample space $\Sigma$. This space is in turn subdivided into zones
$\sigma_1, \sigma_2, \ldots, \sigma_k$ such that $H_i$ is to be accepted if, and only if,
$\{x_i\}$ fall in $\sigma_i$.

A statistical decision problem can now be defined as the problem of selecting the zones
$\sigma_1, \ldots, \sigma_k$ in the sample space $\Sigma$ given the zones
$\pi_1, \ldots, \pi_k$ in the parameter space $\Pi$. The criterion used in this selection
is, to a certain extent, a matter of arbitrary choice for the experimenter. The method
advanced by Wald concerns itself with minimizing a certain function, said to be a measure
of the risks or costs involved in making a decision. We now outline the details of this
approach.


Minimum Average Risk Criterion

Suppose, for the sake of simplicity, that the distribution of $\{x_i\}$ has only one
unknown parameter $\theta$; e.g., the signal-to-noise ratio. The parameter space is now a
line subdivided into segments. Let $L_i(\theta)$ be the probability of rejecting
hypothesis $H_i$ when $\theta$ is true, and suppose $C_i(\theta)$ is the loss caused by
failure to take action $A_i$ when it should have been taken. If we agree beforehand
neither to reward nor to penalize correct decisions, then $C_i(\theta) = 0$ when $\theta$
falls outside $\pi_i$. The expected loss due to a terminal decision is then
$\sum_i L_i(\theta) C_i(\theta)$. Now let $n(\theta)$ be the number of observations, when
$\theta$ is true, and let the cost of each observation be $c$. The conditional risk $r$ is
then defined by

$$r(\theta) = \sum_i L_i(\theta)\, C_i(\theta) + c\, E[n(\theta)] \qquad (1)$$

in which $E[n(\theta)]$ denotes the expected value of $n(\theta)$. Let $g_i$ be the
a priori probability that $H_i$ is true ($\sum_i g_i = 1$). The average risk $R$ is
accordingly defined by

$$R = \sum_i \sum_j g_j\, L_i(\theta_j)\, C_i(\theta_j) + c \sum_j g_j\, E[n(\theta_j)] \qquad (2)$$

in which for the sake of simplicity we let $\theta$ assume only discrete values
$\theta_1, \ldots, \theta_j, \ldots$ (this restriction is not essential).

Given the cost of experimentation and the loss function $C_i$, it may be possible to
devise a test which is superior to all other tests in that it entails the smallest risk
(usually the smallest average risk). Among such optimum tests are: a Minimax test, for
which the maximum value of the conditional risk $r$ is the least, and a Bayes test, for
which the average risk $R$, with respect to a priori probabilities, is the least. Under
certain very weak conditions, a Minimax solution is equivalent to the Bayes solution
relative to the least favorable a priori distribution [25]. Applications of a general
formulation, outlined above, to communication problems have been studied by Middleton and
Van Meter [23, 24]. Here, however, we are interested principally in the problem of
sequential detection, which is an important aspect of the general theory, and which
explicitly introduces the notion of variable sample size or observation periods.
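
As a small numerical illustration of (1) and (2), the following sketch evaluates the conditional and average risks for a two-hypothesis case. All numbers (rejection probabilities, losses, cost per observation, expected sample sizes, priors) are made up for illustration, and the function names are hypothetical.

```python
def conditional_risk(reject_probs, losses, cost_per_obs, expected_n):
    """Eq. (1): r(theta) = sum_i L_i(theta) C_i(theta) + c E[n(theta)]."""
    terminal_loss = sum(l_i * c_i for l_i, c_i in zip(reject_probs, losses))
    return terminal_loss + cost_per_obs * expected_n


def average_risk(priors, reject_probs, losses, cost_per_obs, expected_n):
    """Eq. (2): prior-weighted average of the conditional risks over theta_j."""
    return sum(
        g * conditional_risk(reject_probs[j], losses[j], cost_per_obs, expected_n[j])
        for j, g in enumerate(priors)
    )


if __name__ == "__main__":
    # Row j holds [L_0(theta_j), L_1(theta_j)] and [C_0(theta_j), C_1(theta_j)].
    # Correct decisions carry no loss, so C_1(theta_0) = C_0(theta_1) = 0.
    reject_probs = [[0.1, 0.9], [0.8, 0.2]]
    losses = [[5.0, 0.0], [0.0, 10.0]]
    expected_n = [10.0, 20.0]
    priors = [0.7, 0.3]
    print(average_risk(priors, reject_probs, losses, 0.01, expected_n))
```

With these made-up values the two conditional risks are 0.6 and 2.2, and the average risk is their prior-weighted mean.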


Minimum Risk Formulation of the Detection Problem

In the simplest form of practical interest, detection involves basically a binary (i.e.,
two-decision) choice between the hypothesis $H_0$: signal is absent (only noise is
present) and the hypothesis $H_1$: signal as well as noise is present. The parameter for
which the test is carried out will be, as a rule, the input signal-to-noise ratio, which
is assumed here to remain unchanged during the entire test. The particular value of the
input signal-to-noise ratio which is actually the true one will be denoted by $a$. Our
test, then, has to distinguish between the hypothesis $H_0$ that $a = 0$ and the
alternative hypothesis $H_1$ that $a$ has some specified value $a_1$. Acceptance of $H_1$
is known as an alarm (action $A_1$). By analogy, we shall call here the acceptance of
$H_0$ a dismissal (action $A_0$). The probability of a false alarm (alarm when $a = 0$) is
usually designated by $\alpha$ and the probability of a false dismissal (dismissal when
$a = a_1$) by $\beta$. The quantities $\alpha$ and $\beta$ are referred to as
probabilities of errors of the first and second kind, respectively, just as in the fixed
sample size tests. We also term $(\alpha, \beta)$ the strength of the test. The smaller
$\alpha$ and $\beta$, the stronger is the test.


Let $p$ be the a priori probability of signal and $q$ be the a priori probability of no
signal, so that $p + q = 1$. We can now write risks of (1) and (2) as (conditional risk)

$$r(a) = \alpha C_0(a) + \beta C_1(a) + c E[n(a)] \qquad (3)$$

and (the average risk)

$$R = q\{\alpha C_0(0) + c E[n(0)]\} + p\{\beta C_1(a_1) + c E[n(a_1)]\}. \qquad (4)$$

Notice the relation between the average risk and Siegert’s betting function [8] defined by

$$S = 1 - (q\alpha + p\beta), \qquad (5)$$

which is sometimes specified as a measure of probable success. The quantity $1 - S$ is a
special case of the average risk discussed above [$c = 0$, $C_0(0) = C_1(a_1) = 1$].
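
The statement that $1 - S$ is a special case of the average risk can be checked directly. The sketch below codes (4) and (5); the numerical values are arbitrary and the function names are assumptions introduced only for this example.

```python
def binary_average_risk(alpha, beta, p, c0, c1, c, en0, en1):
    """Eq. (4): R = q{alpha C0(0) + c E[n(0)]} + p{beta C1(a1) + c E[n(a1)]}."""
    q = 1.0 - p
    return q * (alpha * c0 + c * en0) + p * (beta * c1 + c * en1)


def siegert_betting(alpha, beta, p):
    """Eq. (5): S = 1 - (q alpha + p beta)."""
    q = 1.0 - p
    return 1.0 - (q * alpha + p * beta)


if __name__ == "__main__":
    alpha, beta, p = 0.05, 0.10, 0.3  # arbitrary illustrative values
    s = siegert_betting(alpha, beta, p)
    # Special case c = 0, C0(0) = C1(a1) = 1; sample sizes then drop out.
    r = binary_average_risk(alpha, beta, p, 1.0, 1.0, 0.0, 0.0, 0.0)
    print(1.0 - s, r)
```

Both printed values equal $q\alpha + p\beta$, since setting $c = 0$ removes the sample-size terms from (4).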

Wald and Wolfowitz have shown [10] that whatever the assigned probabilities of error
$(\alpha, \beta)$, costs $(C_0, C_1, c)$ and a priori probabilities $(p, q)$, no test will
produce smaller average sample numbers, $E[n(0)]$ and $E[n(a_1)]$, than the sequential
probability ratio test. Consequently, it follows from (3) and (4) that the sequential
probability ratio test minimizes the risks.

The procedure for the sequential probability ratio test is outlined next.


Sequential Test Procedure

A sequential test proceeds in successive stages. At each stage of the experimentation,
the sample space $\Sigma^{(m)}$ is divided into three zones: $\sigma_0^{(m)}$,
$\sigma_1^{(m)}$, and $\sigma^{(m)}$; the superscript $(m)$ is used to indicate the $m$-th
stage. The test is terminated in a dismissal if the sample falls in $\sigma_0^{(m)}$, and
in an alarm if it falls in $\sigma_1^{(m)}$. If the sample falls in $\sigma^{(m)}$ the
test continues. Here $\sigma^{(m)}$ is called the zone of indifference or the test zone.
It is characteristic of sequential tests that zones of indifference separate zones of
acceptance. In the basic case considered here, each stage consists of one observation. A
terminal decision is reached at some stage $n$; $n$ is, of course, a random variable, when
considered over the ensemble of possible runs. Now let $w_m(x; a)\,dx$ be the probability
of obtaining a sample $(x_1, \ldots, x_m)$ when $a$ is true, and consider the likelihood
ratio¹

$$\Lambda_m = p\, w_m(x; a_1) / [q\, w_m(x; 0)]. \qquad (6)$$

¹ In the more general case, the signal can assume a continuum of values and the a priori
distribution of signal, say, $p(a)$ must be given. The test procedure is then defined in
terms of the ratio

$$\Lambda_m = \left[\int w_m(x; a)\, p(a)\, da\right] \Big/ \left[q\, w_m(x; 0)\right] \qquad (6a)$$

rather than in terms of (6). For a systematic treatment, including the a priori
distribution, see [24] (for fixed sample-size).
The division of the sample space is accomplished by selecting two numbers $A'$ and $B'$
($A' > B'$) and establishing the following rules of procedure: The test continues while

$$B' < \Lambda_m < A' \quad (m = 1, \ldots, n - 1) \qquad (7)$$

and terminates in acceptance of $H_0$, if at the $n$th trial

$$p\, w_n(x; a_1) \le B' q\, w_n(x; 0), \qquad (8)$$

or in acceptance of $H_1$, if at the $n$th trial

$$p\, w_n(x; a_1) \ge A' q\, w_n(x; 0). \qquad (9)$$

In order to establish the connection between $A'$, $B'$ and $\alpha$, $\beta$ we consider
the probability measure of each side of (8) over the totality of samples leading to
acceptance of $H_0$. This leads to the inequality [9]

$$p\beta \le B' q (1 - \alpha). \qquad (10)$$

Similarly from (9), we obtain

$$p(1 - \beta) \ge A' q \alpha. \qquad (11)$$

Let us set $B' = (p/q)B$ and $A' = (p/q)A$. Next, consider the probability ratio
$\lambda_m = w_m(x; a_1)/w_m(x; 0)$ (note that $\Lambda_m = (p/q)\lambda_m$). The test
procedure can now be defined on $\lambda_m$ as follows: continue sampling while
$B < \lambda_m < A$, $m = 1, \ldots, n - 1$; accept $H_0$ if $\lambda_n \le B$; accept
$H_1$ if $\lambda_n \ge A$. $A$ and $B$ are termed the upper and lower boundaries of the
test. It is usually assumed that $\lambda_n$ does not exceed boundaries by an appreciable
amount, especially when $n$ tends to be large. Neglecting the so-called “excess over
boundaries” [9, p. 44], we have from (10) and (11):

$$A = (1 - \beta)/\alpha \quad \text{and} \quad B = \beta/(1 - \alpha) \qquad (12)$$

(in all that follows $\alpha$ and $\beta$ are assumed less than ½; this assures us that
$A > 1$ and $B < 1$).

A simple reasoning in support of (12) is given, e.g., by Mood [29]. If sampling is
continuous, the relation (12) is exact. From (12) it follows that

$$\alpha = (1 - B)/(A - B) \quad \text{and} \quad \beta = B(A - 1)/(A - B). \qquad (13)$$

It is important to notice that the ease with which $A$ and $B$ are related to $\alpha$ and
$\beta$ is entirely due to the fact that the detector used is a probability ratio
detector. It is also important to notice that the sequential test is set in terms of
conditional probabilities (probability of error if signal is present and probability of
error if signal is absent). Thus the sequential test can be constructed without using $p$
and $q$; it is independent of a priori probabilities. The average risk $R$, of course,
always depends on a priori probabilities. Hence $p$ and $q$ are needed when $a_1$ is
defined as the least value of signal-to-noise ratio for which the average risk (given
$p$, $q$, and the costs) does not exceed some specified number. This number would be the
maximum acceptable average risk and $a_1$ would be correspondingly called in this
connection the minimum signal-to-noise ratio. For any signal-to-noise ratio larger than
$a_1$ the resulting average risk would be acceptable.
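
The boundary relations (12) and (13) and the stopping rule on $\lambda_m$ can be sketched as follows. The Gaussian example (independent unit-variance samples with mean 0 under $H_0$ and mean $a_1$ under $H_1$) is an assumption made here for illustration and is not one of the cases treated in the paper; the function names and the `max_n` safeguard are likewise hypothetical.

```python
import math
import random


def wald_boundaries(alpha, beta):
    """Eq. (12): A = (1 - beta) / alpha, B = beta / (1 - alpha)."""
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)


def strength_from_boundaries(a_bound, b_bound):
    """Eq. (13): recover (alpha, beta) from the boundaries A and B."""
    alpha = (1.0 - b_bound) / (a_bound - b_bound)
    beta = b_bound * (a_bound - 1.0) / (a_bound - b_bound)
    return alpha, beta


def sequential_test(draw, log_ratio_step, a_bound, b_bound, max_n=10_000):
    """Run one probability ratio test on independent samples.

    draw           -- callable returning the next observation
    log_ratio_step -- log of w(x; a1) / w(x; 0) for a single observation
    Returns ("alarm" | "dismissal" | "undecided", number of observations).
    """
    log_a, log_b = math.log(a_bound), math.log(b_bound)
    log_lambda = 0.0
    for n in range(1, max_n + 1):
        log_lambda += log_ratio_step(draw())
        if log_lambda >= log_a:
            return "alarm", n
        if log_lambda <= log_b:
            return "dismissal", n
    return "undecided", max_n


def false_alarm_rate(a1, alpha, beta, runs, rng):
    """Fraction of noise-only runs ending in an alarm (assumed Gaussian model)."""
    a_bound, b_bound = wald_boundaries(alpha, beta)

    def step(x):
        return a1 * x - 0.5 * a1 * a1  # log ratio for N(a1, 1) versus N(0, 1)

    alarms = 0
    for _ in range(runs):
        decision, _n = sequential_test(lambda: rng.gauss(0.0, 1.0), step, a_bound, b_bound)
        alarms += decision == "alarm"
    return alarms / runs


if __name__ == "__main__":
    print(wald_boundaries(0.05, 0.1))
    print(false_alarm_rate(1.0, 0.05, 0.1, 2000, random.Random(3)))
```

With $\alpha = 0.05$ and $\beta = 0.1$, `wald_boundaries` gives $A = 18$ and $B \approx 0.105$. Because the excess over the boundaries is neglected in (12), the simulated false-alarm rate is typically somewhat below the nominal $\alpha$.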
The question of whether a sequential test will terminate is of fundamental importance.
Stein [12] showed that all moments of $n$ will be finite if observations are independent.
Moreover, Wald has pointed out [9, p. 438] that the probability is unity that the test
will eventually terminate, for a very large class of joint distributions when
observations are not independent. That this is so, granted Stein’s result, follows from
physical considerations. For any band-limited process (with no dc or purely periodic
component) it is possible by taking observations far enough apart in time to produce
independent samples. In addition, it may be pointed out that if the expected length of a
sequential test of strength $(\alpha, \beta)$ were not finite, a classical test of this
strength would not be realizable. This is a direct consequence of the optimum nature of
sequential tests. It must be recognized, however, that some individual sequential tests
may take a very long time to terminate. If such a feature were undesirable, one would
have to resort to some new rule for terminating after a certain stage has been reached.
Among such possible solutions is the truncation procedure discussed in Part II.


Handling of A Priori Information


The setting up of a problem depends on the amount of a priori information available. Three general situations are recognized:

Case 1: Value of Signal Known Exactly. Suppose the signal is known to occur only at a specific value, if at all. Then we are dealing with two simple alternatives, for which the sequential test procedure is that described above. This is the simplest possible situation.

Case 2: Distribution of Signals Known Exactly. Suppose p(a), the a priori distribution of signals, is known exactly. Case 1 is a specialized form of Case 2 with $p(a) = p\,\delta(a - a_1)$. The test procedure is now defined in terms of the weighted probability of a sample; i.e., $\int_{\Gamma_1} W_n(\mathbf{x}, a)\, p(a)\, da\, d\mathbf{x}$, where $\Gamma_1$ includes all the possible non-zero signal values (see footnote reference 1). In place of the probability of false dismissals $p\beta$, we now have an average probability of false dismissals $\bar\beta$, defined by


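$$\bar\beta = \int_{\Gamma_1} da\, p(a) \int_0^{\infty} dn\, p(n \mid d_0; a) \int_{g_0(n)} w_n(\mathbf{x}; a)\, d\mathbf{x} \tag{14}$$

in which

$$q + \int_{\Gamma_1} p(a)\, da = 1 \tag{15}$$

and $p(n \mid d_0; a)$ is the conditional distribution of n if $H_0$ is accepted and a is true.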

From considerations similar to those in the preceding section, it can be shown that A′ and B′ are approximately given by $A' = (1 - \bar\beta)/(q\alpha)$ and $B' = \bar\beta/[q(1 - \alpha)]$. The boundaries of the test are still determined by α and $\bar\beta$ in a simple fashion. Whether the distribution of the sample size can be derived will depend entirely on how complex is the form of $\Lambda_n$ for the specific p(a).
Case 3: Distribution of Signals Known Incompletely. This is perhaps the most common case in a practical situation. The procedure here is to select an α and to fix β at a suitable signal level ($a = a_1$). Then one constructs the test as if one were dealing with a simple alternative case. With this method only the choice of $a_1$ involves any a priori knowledge. Notice that although the test is now constructed as if signal-to-noise ratios 0 and $a_1$ were the only possible alternatives, the actual signal-to-noise ratio, say a, will be in general different from either of these two alternatives.

The probability of a terminal decision "to dismiss" is a function of both a and $a_1$, and is denoted by L(a).

Before we examine L(a), let us emphasize that sequential tests pose similar problems with respect to the handling of and the requirements for a priori information as do classical tests; e.g., both Minimax and Bayes sequential tests are possible under suitable circumstances.


Operating Characteristic Function (OCF) 


The probability L(a) of accepting $H_0$ at the end of a test is termed by Wald the Operating Characteristic Function (OCF) of the test [9]. As a rule, L(a) decreases as a increases. Because we consider a test which is certain to terminate, the probability of accepting $H_1$ is given by 1 − L(a). For independent observations, Wald has developed a method of finding L(a) with the aid of a parametric equation. This method can be extended to the case of correlated samples.

Suppose we are testing hypothesis $H_1$ that $w_m(\mathbf{x}; a_1)$ is the distribution of x (signal present) against hypothesis $H_0$ that the distribution is $w_m(\mathbf{x}; 0)$ (signal absent). Let the upper threshold be A and the lower B. The (conditional) probability ratio of this test is

$$\lambda_m = w_m(\mathbf{x}; a_1)/w_m(\mathbf{x}; 0). \tag{16}$$

Following the procedure described in the last paragraph, we accept $H_0$ (declare signal absent) whenever $\lambda_m < B$. Construct now $\lambda_m^h$, where $h = h(a, a_1)$ satisfies the condition

$$\int_{-\infty}^{\infty} \lambda_m^h\, w_m(\mathbf{x}; a)\, d\mathbf{x} = 1; \tag{17}$$

h is then a parameter depending on the distribution of the observed variables. From the last equation it follows that $f_m(\mathbf{x}) = \lambda_m^h w_m(\mathbf{x}; a)$ is, quite formally, some distribution of x. If we next consider a new sequential test, whose object is to decide whether $f_m(\mathbf{x})$ or $w_m(\mathbf{x}; a)$ is the true distribution of x, the probability ratio of this new test is $\lambda_m^h$. Letting the boundaries of the new test be C and D, we see that the probability of declaring $w_m(\mathbf{x}; a)$ the true distribution when it indeed is true, is from (13)

$$(C - 1)/(C - D). \tag{18}$$


Let us now select $C = A^h$ and $D = B^h$ (h > 0); then the new test has the property that $w_m(\mathbf{x}; a)$ is declared true ($\lambda_m^h < B^h$) whenever $H_0$ is declared true in the original test ($\lambda_m < B$). Hence the probability of accepting $H_0$ when $w_m(\mathbf{x}; a)$ is true is, from (18), $(A^h - 1)/(A^h - B^h)$. The same relation can be shown to hold for h negative. Hence we get the Operating Characteristic Function

$$L(a) = (A^h - 1)/(A^h - B^h), \tag{19}$$

in which h satisfies the relation

$$\int_{-\infty}^{\infty} [w_m(\mathbf{x}; a_1)/w_m(\mathbf{x}; 0)]^h\, w_m(\mathbf{x}; a)\, d\mathbf{x} = 1. \tag{20}$$

For independent observations, we have $w_m(\mathbf{x}; a) = \prod_{i=1}^{m} w_1(x_i; a)$ and (20) reduces to the form

$$\int_{-\infty}^{\infty} [w_1(x; a_1)/w_1(x; 0)]^h\, w_1(x; a)\, dx = 1. \tag{21}$$

It is easy to verify that h(0) = 1 and $h(a_1) = -1$. Also we have L(0) = 1 − α and $L(a_1) = \beta$, as required.
The value of a at which h = 0 will be denoted by a′. A limiting process has to be applied at this point:

$$L(a') = \log A/\log (A/B) \tag{22}$$

$$\left. \frac{\partial L}{\partial h} \right|_{a = a'} = -(\log A)(\log B)/[2 \log (A/B)]. \tag{23}$$


When α = β (and hence A = 1/B), we get from (19)

$$L(a) = 1/(1 + A^{-h}). \tag{24}$$

The OCF of the test is not only important in its own right but it is also needed for the computation of risks and, most important of all, it allows the evaluation of the Average Sample Number of the test.
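The operating characteristic of (19), together with its limit (22) at h = 0, can be evaluated numerically as in the following minimal sketch. The function names are illustrative, and the thresholds are taken to be Wald's approximations A = (1 − β)/α and B = β/(1 − α), an assumption that is consistent with L(0) = 1 − α and $L(a_1) = \beta$ quoted above.

```python
import math

def thresholds(alpha, beta):
    """Assumed Wald thresholds: A = (1 - beta)/alpha, B = beta/(1 - alpha)."""
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)

def ocf(h, alpha, beta):
    """Operating characteristic L = (A^h - 1)/(A^h - B^h), Eq. (19)."""
    big_a, big_b = thresholds(alpha, beta)
    log_a, log_b = math.log(big_a), math.log(big_b)
    if h == 0.0:
        # Limiting value at h = 0, Eq. (22): log A / log(A/B).
        return log_a / (log_a - log_b)
    # expm1 keeps A^h - 1 accurate when h is close to zero.
    num = math.expm1(h * log_a)
    return num / (num - math.expm1(h * log_b))

print(ocf(1.0, 0.01, 0.05))    # no signal: h = 1
print(ocf(-1.0, 0.01, 0.05))   # signal at the preset level: h = -1
print(ocf(0.0, 0.01, 0.01))    # equal errors at the crossover h = 0
```

With h = 1 and h = −1 the function should return 1 − α and β respectively, which gives a quick check on any choice of thresholds.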


Average Sample Number (ASN) 


The sequential test procedure has been discussed in terms of the probability ratio $\lambda_m = w_m(\mathbf{x}; a_1)/w_m(\mathbf{x}; 0)$. It is convenient now to consider $\log \lambda_m$, which we denote by $Z_m$; i.e.,

$$Z_m = \log \lambda_m = \log\,[w_m(\mathbf{x}; a_1)/w_m(\mathbf{x}; 0)]. \tag{25}$$

The test procedure can be stated in terms of $Z_m$ as follows: Construct $Z_m$ at each stage of the experiment. Continue testing while

$$\log B < Z_m < \log A. \tag{26}$$

Terminate by accepting $H_0$ when $Z_m < \log B$. Terminate by accepting $H_1$ when $Z_m > \log A$.
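The stopping rule just stated can be sketched in a few lines. The sketch below assumes independent observations, so that the test statistic is a running sum of per-sample log ratios, and it assumes Wald's approximate thresholds A = (1 − β)/α and B = β/(1 − α); the function name and the return convention are illustrative only.

```python
import math

def sequential_test(increments, alpha, beta):
    """Run the two-threshold test on per-sample log-ratio increments.

    Returns (decision, n): decision is 1 (alarm, accept H1), 0 (dismissal,
    accept H0), or None if the data ran out before a threshold was reached.
    """
    log_a = math.log((1.0 - beta) / alpha)   # upper threshold, log A
    log_b = math.log(beta / (1.0 - alpha))   # lower threshold, log B
    z_sum, n = 0.0, 0
    for n, z in enumerate(increments, start=1):
        z_sum += z                           # running log probability ratio
        if z_sum >= log_a:
            return 1, n
        if z_sum <= log_b:
            return 0, n
    return None, n

print(sequential_test([0.5] * 20, 0.01, 0.01))
print(sequential_test([-0.5] * 20, 0.01, 0.01))
```

A caller would normally feed the function a generator of log ratios computed from incoming samples, so that no more data are consumed than the test requires.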

Since the average value of $Z_n$, say $\bar Z_n$, must be equal to the value of the bounds, weighted by the probability of reaching them, we have

$$\bar Z_n = L(a) \log B + [1 - L(a)] \log A. \tag{27}$$

Now for independent observations $Z_n = \sum_{i=1}^{n} z_i$, where

$$z_i = \log\,[w_1(x_i; a_1)/w_1(x_i; 0)]. \tag{28}$$

Let $\bar n$ denote the Average Sample Number (average number of observations required to terminate the test); then for independent observations $\bar Z_n = \bar n \bar z$, and by virtue of (27), one gets Wald's result [9] for the average sample size when the input signal-to-noise (rms amplitude) ratio is a:

$$\bar n(a) = \{L(a) \log B + [1 - L(a)] \log A\}/\bar z(a); \tag{29}$$

($\bar n(a)$ is used as an alternative notation for E[n(a)].) For correlated observations it may be still possible to obtain a simple relation for $\bar n$ from (27). The condition is that $\bar Z_n$ be expressible as a linear function of $\bar n$. Specific examples are taken up in Part II. If we let


$$g(\alpha, \beta) = (1 - \alpha) \log B + \alpha \log A, \tag{30}$$

we can express $\bar n(0)$, the ASN in absence of signal, and $\bar n(a_1)$, the ASN in presence of signal, as

$$\bar n(0) = g(\alpha, \beta)/\bar z(0), \quad \text{and} \quad \bar n(a_1) = -g(\beta, \alpha)/\bar z(a_1). \tag{31}$$

This means that $\bar n(a_k)$ (k = 0, 1) can be split up into two factors, one of which depends only on the probabilities of error (α, β) and the other only on the distribution function of the observed variable and on the value $a_k$. Hence, if two detectors of the same strength (α, β) are to be compared by their required sample sizes, only the quantities $\bar z$ need be examined. The auxiliary function g(α, β) is plotted on Fig. 1.


[Figure 1 plot: −g(α, β); α = probability of false alarms, β = probability of false dismissals; g(α, β) = (1 − α) log B + α log A.]

Fig. 1. The auxiliary function g(α, β).


Notice that at a′ (h = 0), both the numerator and the denominator of (29) become zero, since $\bar z(a') = 0$.² The relation (29) is therefore indeterminate at a = a′. In order to find $\bar n(a')$, the ASN when a = a′, we consider the expected value of $Z_n^2$. At termination, $Z_n^2$ must be equal to the square of either one or the other bound. Taking account of probabilities with which the test terminates in either decision, we get

$$\overline{Z_n^2} = L (\log B)^2 + (1 - L)(\log A)^2. \tag{32}$$

Now for independent observations, if $\bar z = 0$, we have

$$\overline{Z_n^2} = \bar n\, \overline{z^2}, \tag{33}$$

so that recalling (22) we obtain

$$\bar n(a') = -(\log A)(\log B)/\overline{z^2}(a'), \tag{34}$$

in which log B is negative (B < 1), and consequently $\bar n > 0$.

² Eq. (21) defining h can also be written $\int_{-\infty}^{\infty} e^{hz} w(z)\, dz = 1$. Differentiating both sides with respect to h and setting h = 0 (a = a′), we get $\bar z(a') = 0$.

In the important special case of equal probabilities of error (α = β; hence A = 1/B), relation (29) becomes

$$\bar n(a) = -(\log A)[\tanh (\tfrac{1}{2} h \log A)]/\bar z(a). \tag{35}$$

[The negative sign is accounted for by the fact that h and $\bar z$ always have opposite signs.]

For small α and β one gets simply the limiting forms:

$$\bar n(a_1) \to (\log A)/\bar z(a_1) \quad \text{and} \quad \bar n(0) \to (\log B)/\bar z(0). \tag{36}$$


Continuous Detection


In many communication problems detection is to be carried out as a continuous process; i.e., instead of a series of discrete sample values the data is observed continuously. The theory of sequential detection can be extended in a natural manner to apply to such continuous sampling, if the noise is Gaussian.

Consider observations spaced uniformly at times Δt apart. The expression for $Z_m$ will in general involve a summation, which because of correlation between observations depends on Δt. One can let Δt → 0, keeping at the same time mΔt = T finite, to obtain a detector output $Z_T$. The rules of procedure stated for $Z_m$ in the last section now apply to $Z_T$. Usually $Z_T$ requires an integration (in place of summation), in which T appears as the upper bound of integration and 0 as the lower one. Suppose that the test terminates at some time $\mathcal{T}$, corresponding to nΔt. Then $\mathcal{T}$ is a random variable, inasmuch as n is. The probability L(a) of accepting $H_0$ (no signal) at the end of the test can still be introduced. One obtains the fundamental relation

$$\bar Z_{\mathcal{T}} = L \log B + (1 - L) \log A. \tag{37}$$

From this relation the Average Sample Length $\bar{\mathcal{T}}$ (cf. with $\bar n$) can be obtained, provided $\bar Z_{\mathcal{T}}$ is expressible in terms of $\bar{\mathcal{T}}$.

An alternative method of treating correlated samples of Gaussian noise involves the choice of a proper linear transformation leading to a new set of variables $y_1, y_2, \cdots, y_m$, such that the new variables are independent and all follow the same distribution. Since the test is sequential, the $y_i$ are made independent of all $x_k$ for k > i; this amounts to having a triangular transformation matrix and is closely related to the problem of inverting the original matrix of correlations. An example of this technique is given in the second section of Part II.

Under certain conditions, the entire distribution of the termination time $\mathcal{T}$, rather than just its mean $\bar{\mathcal{T}}$, can be obtained [21, 22]. Of particular interest is, of course, $\overline{\mathcal{T}^2}$, the second moment of $\mathcal{T}$, which provides a measure of dispersion. $\mathcal{T}$ is in effect the first passage time of a random walk with absorbing barriers and, in addition to the Sequential Analysis, can be studied by going back to the original differential equations of the process (e.g., [30, 31]).


II. EXAMPLES OF THE OPTIMUM SEQUENTIAL DETECTION OF SIGNALS IN NOISE


We shall consider here specific distributions of the observed variable corresponding to some typical detection problems. The examples selected are: incoherent detection of a sinewave in normal noise; coherent detection of a causal signal in normal noise; and detection of a normal noise "signal" in normal random noise.


Incoherent Detection of a Sinewave in Noise


Let the random variable under inspection be the envelope of a narrow-band noise and an additive sinewave at the center frequency of the noise band. The noise is assumed to have a Gaussian distribution function with a zero mean. Under these conditions the distribution of signal-plus-noise envelope is given by the well-known expression [32]

$$W(X; A) = \begin{cases} (X/\psi) \exp[-(X^2 + A^2)/2\psi]\, I_0(AX/\psi), & X > 0 \\ 0, & X < 0, \end{cases} \tag{38}$$

where X is the amplitude of the output envelope, ψ is the mean square value of noise, A is the peak amplitude of the sinewave and $I_0$ is the modified Bessel function of the first kind, zeroth order. It is convenient to change the notation by letting

$$x = X/\sqrt{2\psi} \quad \text{and} \quad a = A/\sqrt{2\psi}; \tag{39}$$

a is then simply the square root of the signal-to-noise-power ratio. The probability density of x is accordingly

$$w(x; a) = \begin{cases} 2x\, e^{-x^2 - a^2} I_0(2ax), & x > 0 \\ 0, & x < 0. \end{cases} \tag{40}$$


Note that to get (38) an average over phase has been performed corresponding to the a priori knowledge that the distribution of phase angle is uniform (see, e.g., [33, 34]). In what follows the observed variable is x, and the observations are assumed independent of one another. The sequence of n observations (voltage measurements) is referred to as a sample of length n. The distribution of x has an unknown parameter a: if a = 0, the signal is absent; if a > 0, a signal is present. We define a random variable z as the logarithm of the probability ratio, $z = \log\,[w(x; a_1)/w(x; 0)]$, so that from (40)

$$z = -a_1^2 + \log I_0(2a_1 x). \tag{41}$$


The probability ratio detector constructs $z_i$, corresponding to each sample value $x_i$ of the received wave. A classical, fixed sample test requires the construction of the same function z [7]. The difference between classical and sequential test procedures arises when instead of adding up a fixed number of $z_i$ and checking the sum against a single threshold, we now form $\sum_{i=1}^{m} z_i$, observation by observation, checking this sum against two thresholds at each stage m (see Part I, Sequential Test Procedure). For weak signals a power series expansion can be made in (41) and we obtain

$$z = -a_1^2 + a_1^2 x^2 - \tfrac{1}{4} a_1^4 x^4 + O(a_1^6 x^6). \tag{42}$$

In the past, it has been customary to retain only the constant and the term in $x^2$.

It has been pointed out recently [20-22] that the contribution of the fourth order term is significant in computing $\bar z$, the first moment of z. However, it is still possible to approximate the optimum detector by a square law operation on x, provided the bias be modified to include $\tfrac{1}{4} a_1^4 \overline{x^4}$, the expected value of the fourth order term. The distribution function (40) leads to $\overline{x^2} = 1 + a^2$ and $\overline{x^4} = 2!\,(1 + 2a^2 + \tfrac{1}{2} a^4)$ (see e.g., [8, 32]). Hence the weak signal approximation to the detector of (41) is not

$$z = -a_1^2 + a_1^2 x^2 \tag{43}$$

but it is

$$z = -a_1^2 (1 + \tfrac{1}{2} a_1^2) + a_1^2 x^2. \tag{44}$$

Thus the correct expression for $\bar z$ is

$$\bar z(a) = a_1^2 a^2 - \tfrac{1}{2} a_1^4, \quad a, a_1 \ll 1, \tag{45}$$

leading to $\bar z(0) = -\tfrac{1}{2} a_1^4$ and $\bar z(a_1) = \tfrac{1}{2} a_1^4$, and not $\bar z(0) = 0$ and $\bar z(a_1) = a_1^4$ as has sometimes been assumed in the past. Notice that if $\bar z(0) = 0$ were true, it would lead to an infinite Average Sample Number in absence of signal, since we have seen that $\bar n$ is inversely proportional to $\bar z$. An infinite ASN would, of course, contradict the theorem that sequential tests terminate in a finite number of observations with probability 1 (see the Sequential Test Procedure section).

We emphasize the importance of the correction bias 
due to the fourth order term because this point does not 
seem to have been generally appreciated in the treatment 
of nonsequential detectors. The exact value of bias is 
important if the bias is used as an independent variable. 
In many cases bias is eliminated between the expressions 
for errors of the two kinds. In such cases, the use of the 
square law term alone is good enough, and it does not 
matter whether or not the fourth order term is used. The 
overall conclusion is that depending on the statistics of the 
variable under test, terms higher than the square may be 
important and should in each case be carefully examined. 
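The size of the fourth order correction can be checked numerically. The sketch below averages the exact z of (41) over the density (40) by Simpson's rule and compares the result with the weak-signal values ∓½a₁⁴. The helper names, the integration range, and the step count are choices made for this illustration, and the Bessel function is summed from its power series, which is adequate only for the small arguments used here.

```python
import math

def bessel_i0(u, terms=40):
    """Modified Bessel function I0(u) from its power series (small u only)."""
    total, term, q = 1.0, 1.0, 0.25 * u * u
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def mean_z(a, a1, x_max=8.0, steps=4000):
    """Mean of z = -a1^2 + log I0(2 a1 x) when x has the density
    2x exp(-x^2 - a^2) I0(2 a x); composite Simpson rule, steps even."""
    def integrand(x):
        density = 2.0 * x * math.exp(-x * x - a * a) * bessel_i0(2.0 * a * x)
        return math.log(bessel_i0(2.0 * a1 * x)) * density
    h = x_max / steps
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return -a1 * a1 + total * h / 3.0

a1 = 0.1
print(mean_z(0.0, a1), -0.5 * a1 ** 4)   # no signal
print(mean_z(a1, a1), 0.5 * a1 ** 4)     # signal at the preset level
```

Dropping the fourth order term would instead predict a mean of zero in the absence of signal, which is the case the text warns against.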

A detector constructed according to (41) is optimum only for a specific signal-to-noise ratio $a_1$. This is equally true of sequential and of nonsequential operations. The effect of the true signal-to-noise ratio on the performance can be determined from the Operating Characteristic Function (OCF), which gives the probability of $H_0$ being accepted at the end of a test (see Section on OCF in Part I). The OCF is, we recall, given by $L(a) = (A^h - 1)/(A^h - B^h)$, where h is now defined by

$$2 \int_0^{\infty} [I_0(2a_1 x)]^h\, I_0(2ax)\, x\, e^{-x^2}\, dx = e^{a^2 + h a_1^2}. \tag{46}$$

While the general solution of (46) presents problems, in the weak signal case appropriate to threshold reception we obtain approximately $h = 1 - 2 (a/a_1)^2$. Some typical OCF curves are shown in Fig. 2. The point at which the slope is maximum may properly be called the point of greatest ambiguity as to the outcome of the test.


[Figure 2 plot: L(a), the probability of accepting $H_0$, vs $a/a_1$; a = true signal-to-noise ratio, $a_1$ = pre-set signal-to-noise ratio.]

Fig. 2. OCF of the incoherent detector; weak signals ($a, a_1 \ll 1$).


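Once the OCF is known, the ASN can be computed from (29). The additional quantity needed is $\bar z(a)$, and so we have

$$\bar z(a) = -a_1^2 + \int_0^{\infty} [\log I_0(2a_1 x)]\, 2x\, e^{-x^2 - a^2} I_0(2ax)\, dx. \tag{47}$$

When a = 0, this last expression reduces to the form

$$\bar z(0) = -2a_1^2 \int_0^{\infty} [I_2(2a_1 x)/I_0(2a_1 x)]\, x \exp (-x^2)\, dx. \tag{48}$$

For weak signals ($a_1 \ll 1$), we obtain [22]

$$\bar z(0) = -\tfrac{1}{2} a_1^4 (1 - \tfrac{4}{3} a_1^2 + \tfrac{11}{4} a_1^4 - \cdots). \tag{49}$$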

For strong signals, we get [22] on the other hand 


2(Gin= —ai(o.45 ee a UES, —---+2I1n as) — 0.178. 
ay ay 
(50) 
From (47) one also finds that for weak signals ($a_1 \ll 1$)

$$\bar z(a_1) = \tfrac{1}{2} a_1^4 (1 - \tfrac{2}{3} a_1^2 + \tfrac{3}{4} a_1^4 - \cdots). \tag{51}$$


In Fig. 3, $\bar z(a_1)$ and $|\bar z(0)|$ are plotted. It is rather remarkable to notice that $|\bar z(0)|$ and $\bar z(a_1)$ are approximately equal over the entire range of $a_1$. This relation is not obvious from a simple inspection of (47) and, on the basis of (30) and (31), can be taken to mean that when there is no signal (a = 0), a test of strength (α, β) will last, on the average, approximately as long as a test of strength (β, α) when signal appears at the preset level ($a = a_1$). The relation $-\bar z(0) = \bar z(a_1)$ seems characteristic of sequential tests in general and holds exactly when the observed variable has Gaussian statistics (see below).

Fig. 3. Expected value of z for the incoherent detector.

For a more complete understanding of the mechanism of a sequential test it is not enough to know $\bar n(0)$ and $\bar n(a_1)$, but it is also important to examine the ASN as a function of a. A family of such curves is shown in Fig. 4.
[Figure 4 plot: ASN vs $a/a_1$; a = true signal-to-noise ratio, $a_1$ = pre-set signal-to-noise ratio.]

Fig. 4. ASN ($\times\, a_1^4$) of the incoherent detector vs $a/a_1$; weak signals.


The most striking feature is the peak of $\bar n$ at a value between a = 0 and $a = a_1$. This peak can be explained by the fact that there is no pronounced tendency to cross either boundary. The ASN increases as the probability of errors is reduced. The essence of sequential procedure is that we save on the average number of observations at the price of randomizing the sample length n. It is of interest therefore to know the variance of n. We denote the variance of n by $\sigma_n^2(a)$, in which a is the true value of the signal-to-noise ratio. Let the variance of z be denoted by $\sigma_z^2(a)$. It can be shown [22] for small α and β, and consequently large $\bar n$, that the variance of n is approximately given by

$$\sigma_n^2(0) = (\log B)\, \sigma_z^2(0)/[\bar z(0)]^3 \tag{52}$$

when no signal is present, and by

$$\sigma_n^2(a_1) = (\log A)\, \sigma_z^2(a_1)/[\bar z(a_1)]^3 \tag{53}$$

when signal is present. The larger the variance of n, the more saving on the Average Sample Number. For small variance there is little saving, since there is then little difference between a sequential and a nonsequential test.


Coherent Detection of Signals in Normal Random Noise. 


The envelope detection which was discussed above requires knowledge of the amplitude but not of the detailed (RF) phase structure of the signal. For coherent detection, on the other hand, the fine-structure of the signal must be known. Treatment of optimum fixed sample size detectors is available in several publications, e.g., [5-7, 34]. We shall now discuss sequential, coherent detectors.

Suppose a causal signal voltage is given as a function of time by A s(t). Let the value of s at the time $t_i$, corresponding to the ith observation, be denoted by $s_i$; i.e., $s(t_i) = s_i$. We assume s(t) normalized in such a fashion that

$$\lim_{N \to \infty} (1/N) \sum_{i=1}^{N} s_i^2 = 1; \tag{54}$$

hence $A^2$ measures the mean square value of the observations (as N → ∞) and s(t) gives the normalized waveform. When the signal is absent, A = 0, and when signal is present $A = A_1$. Let us denote the amplitude of a mixture of signal and noise by X. The noise is assumed Gaussian with a zero mean and an rms value $\psi^{1/2}$. In this example, detection corresponds to a statistical test for the mean of X, whose value on the ith observation is either zero ($H_0$) or $A_1 s_i$ ($H_1$). In the simplest case, all $s_i$ are equal. More generally, the $s_i$ differ from each other but assume prescribed values. When the $s_i$ are not all equal, a difficulty peculiar to sequential detection is introduced: in averaging over the test length, the upper limit is itself a random variable. Such averaging occurs when the Average Sample Number (ASN) is to be computed. If the signal is periodic, as is the case in some practical situations, the difficulty can be removed, provided tests last long enough to render end-effects negligible. The test itself can always be set up to have a certain strength; this is assured by the choice of thresholds. Analytic difficulties arise only when the ASN is to be computed.

Let us next perform a normalization indicated by $x = X/\psi^{1/2}$ and $a = A/\psi^{1/2}$, so that a is the rms (input) signal-to-noise ratio. We consider x as the observed variable and a as the unknown parameter. The null hypothesis $H_0$, of no signal, corresponds to $a = a_0 = 0$.

The alternative hypothesis $H_1$, of signal at threshold level, corresponds to $a = a_1$. The m-dimensional distribution function of a sample $\mathbf{x}\,(x_1, \cdots, x_m)$ of signal-plus-noise under hypothesis $H_k$ (k = 0, 1) is

$$W_m(\mathbf{x}; a_k \mathbf{s}) = (2\pi)^{-m/2} [\det (\sigma_{ij})]^{-1/2} \exp \Big[ -\tfrac{1}{2} \sum_{i,j} \sigma^{ij} (x_i - a_k s_i)(x_j - a_k s_j) \Big], \tag{55}$$

in which s is used to indicate the values $s_1, \cdots, s_m$ of signal. The determinant of the matrix of covariances is denoted by $\det (\sigma_{ij})$ and elements of the inverse matrix by $\sigma^{ij}$; note that $\det (\sigma_{ij}) = [\det (\sigma^{ij})]^{-1}$. Because of the chosen normalization, we have $\sigma_{ii} = 1$ for all i. The logarithm of the probability ratio $W_m(\mathbf{x}; a_1 \mathbf{s})/W_m(\mathbf{x}; 0)$ is

$$Z_m = -\tfrac{1}{2} \sum_{i,j} \sigma^{ij} (a_1^2 s_i s_j - a_1 s_i x_j - a_1 s_j x_i). \tag{56}$$

If the signal does not vary from observation to observation ($s_i = s_j = 1$, etc.), one has

$$Z_m = -\tfrac{1}{2} \sum_{i,j} \sigma^{ij} [a_1^2 - a_1 (x_i + x_j)]. \tag{57}$$

An important illustration of the structure of a probability ratio detector is supplied by noise which has an exponential autocorrelation function, $r(t) = \exp (-\gamma |t|)$. Such an autocorrelation function is encountered at the output of an RC-filter through which "white" noise is passed. It also corresponds to the envelope of narrow-band RLC noise. If observations are taken at equal intervals, say D, the normalized correlation coefficient between the ith and the jth observation depends only on |i − j|, that is $\sigma_{ij} = \rho^{|i-j|}$ where $\rho = \exp (-\gamma D)$. The inverse matrix for this case has been obtained by Reich and Swerling [6] and is relatively simple: $\sigma^{11} = \sigma^{mm} = 1/(1 - \rho^2)$, $\sigma^{ii} = (1 + \rho^2)/(1 - \rho^2)$ (i ≠ 1, m), $\sigma^{i,i+1} = \sigma^{i+1,i} = -\rho/(1 - \rho^2)$, all other $\sigma^{ij} = 0$. Substituting these values for $\sigma^{ij}$ in (56), we get

$$\begin{aligned} Z_m = -\frac{1}{2(1 - \rho^2)} \Big\{ & \sum_{i=2}^{m-1} \big[ a_1^2 (1 + \rho^2) s_i^2 - a_1^2 \rho\, s_i (s_{i+1} + s_{i-1}) - 2a_1 (1 + \rho^2) s_i x_i + 2a_1 \rho\, x_i (s_{i+1} + s_{i-1}) \big] \\ & + a_1^2 (s_1^2 + s_m^2) - a_1^2 \rho (s_1 s_2 + s_{m-1} s_m) - 2a_1 (s_1 x_1 + s_m x_m) + 2a_1 \rho (x_1 s_2 + x_m s_{m-1}) \Big\}. \end{aligned} \tag{58}$$
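The tridiagonal inverse quoted above for the exponential correlation matrix is easy to verify numerically. The following sketch, with illustrative function names and arbitrarily chosen m and ρ, multiplies the correlation matrix by the stated inverse and also compares the sum of the inverse elements with the closed form given later as (73).

```python
def correlation_matrix(m, rho):
    """sigma_ij = rho**|i - j| for m equally spaced observations."""
    return [[rho ** abs(i - j) for j in range(m)] for i in range(m)]

def inverse_correlation_matrix(m, rho):
    """Tridiagonal inverse quoted in the text (m >= 2, |rho| < 1)."""
    d = 1.0 - rho * rho
    inv = [[0.0] * m for _ in range(m)]
    for i in range(m):
        # End elements are 1/d, interior diagonal elements (1 + rho^2)/d.
        inv[i][i] = (1.0 if i in (0, m - 1) else 1.0 + rho * rho) / d
        if i + 1 < m:
            inv[i][i + 1] = inv[i + 1][i] = -rho / d
    return inv

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

m, rho = 6, 0.7
product = matmul(correlation_matrix(m, rho), inverse_correlation_matrix(m, rho))
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(m) for j in range(m))
total = sum(map(sum, inverse_correlation_matrix(m, rho)))
print(max_error)
print(total, ((m - 2) * (1 - rho) + 2) / (1 + rho))
```

Because the inverse has only three nonzero diagonals, the detector needs each sample and its immediate neighbours only, which is what makes the stage-by-stage form of (58) practical.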
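Introducing the standard difference operators $\Delta s_i = s_{i+1} - s_i$ and $\Delta^2 s_i = s_{i+2} - 2 s_{i+1} + s_i$ and neglecting the end-effects (terms in i = 1 and i = m give insignificant contributions if m is large), we are left with

$$Z_m = -\tfrac{1}{2} a_1 \Big[ \frac{1 - \rho}{1 + \rho} \sum_{i=2}^{m-1} s_i (a_1 s_i - 2x_i) - \frac{\rho}{1 - \rho^2} \sum_{i=2}^{m-1} \Delta^2 s_{i-1} (a_1 s_i - 2x_i) \Big]. \tag{59}$$

For a small spacing between observations (D = Δt), $\rho = 1 - \gamma \Delta t + \cdots$; hence $Z_m$ can be approximated by

$$Z_m \approx -\tfrac{1}{4} (1/\gamma) \sum_{i=2}^{m-1} (a_1^2 s_i - 2a_1 x_i) \big[ \gamma^2 s_i - \Delta^2 s_{i-1}/(\Delta t)^2 \big] \Delta t. \tag{60}$$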
Transition to the continuous case is effected by letting Δt → 0 and setting the duration T of the test equal to mΔt in the limit. We find that

$$Z_T = -\tfrac{1}{4} (1/\gamma) \int_0^T [a_1^2 s(t) - 2a_1 x(t)][\gamma^2 s(t) - \ddot s(t)]\, dt, \tag{61}$$

exclusive of end-effects, which are ignorable if $T \gg \gamma^{-1}$. In particular for a sinewave of radial frequency $\omega_0$, $\ddot s = -\omega_0^2 s$, so that one has

$$Z_T = -\tfrac{1}{4} \gamma (1 + \omega_0^2 \gamma^{-2}) \int_0^T [a_1^2 s^2(t) - 2a_1 x(t) s(t)]\, dt. \tag{62}$$

The detection procedure in this example is as follows: Supply the detector with the signal waveform s(t). Continue integrating the quantity $[2a_1 x(t) s(t) - a_1^2 s^2(t)]$, or, in effect, crosscorrelating x(t) and s(t). Detection terminates in an alarm or a dismissal, respectively, when the integral rises above $(4\gamma \log A)/(\gamma^2 + \omega_0^2)$ or falls below $(4\gamma \log B)/(\gamma^2 + \omega_0^2)$.

Another approach to the problem of designing optimum sequential detectors for correlated samples with a Gaussian distribution is to choose a proper linear transformation leading to a new set of variables $y_1, \cdots, y_m$, such that the new variables are independent and are all made to follow the same distribution. Moreover, the $y_i$ are made independent of all the $x_k$ for k > i. Using again the example of RC-filtered noise, consider the transformation

$$\begin{aligned} y_1 &= x_1 (1 - \rho^2)^{1/2} \\ y_2 &= x_2 - \rho x_1 \\ &\;\;\vdots \\ y_m &= x_m - \rho x_{m-1}, \end{aligned} \tag{63}$$

in which $\rho = \overline{x_i x_{i+1}} - \bar x_i \bar x_{i+1}$, $\overline{x_i^2} - \bar x_i^2 = 1$ and $\bar x_i = a s_i$. Clearly, we have $\overline{y_i^2} - \bar y_i^2 = 1 - \rho^2$. The distribution of any $y_i$ (except $y_1$) is

$$w(y_i; a s_i) = [2\pi (1 - \rho^2)]^{-1/2} \exp \{ -[y_i - a (s_i - \rho s_{i-1})]^2 / 2(1 - \rho^2) \}. \tag{64}$$

The logarithm of the probability ratio, $\xi_i = \log\,[w(y_i; a_1 s_i)/w(y_i; 0)]$, is then given by

$$\xi_i = \tfrac{1}{2} [2a_1 y_i (s_i - \rho s_{i-1}) - a_1^2 (s_i - \rho s_{i-1})^2]/(1 - \rho^2). \tag{65}$$

Substituting for $y_i$ in terms of $x_i$, we obtain (except for $\xi_1$)

$$\xi_i = \tfrac{1}{2} (s_i - \rho s_{i-1}) [2a_1 (x_i - \rho x_{i-1}) - a_1^2 (s_i - \rho s_{i-1})]/(1 - \rho^2). \tag{66}$$

The detection is performed by checking at each stage whether or not $\sum_{i=1}^{m} \xi_i$ exceeds one of the two usual bounds, log A or log B. As expected, $\sum_i \xi_i$ checks with $Z_m$ in (58). The operation of inverting a matrix, required by the first method, and the operation of finding a triangularizing transformation are, of course, intimately related (inversion of a large class of correlation matrices is discussed in [35]). Either one or the other has to be carried out in order to determine the form of an optimum detector.
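The agreement between the two methods can be confirmed numerically. The sketch below evaluates the log probability ratio once from (56) with the tridiagonal inverse matrix and once as a running sum of the whitened terms, for an arbitrary waveform and arbitrary data. The function names, the test waveform, and the indexing $y_i = x_i - \rho x_{i-1}$ are assumptions of this illustration; the first term is worked out separately from $y_1 = x_1 (1 - \rho^2)^{1/2}$.

```python
import math
import random

def log_ratio_matrix(x, s, a1, rho):
    """Z_m from Eq. (56), using the tridiagonal inverse of rho**|i-j|."""
    m, d = len(x), 1.0 - rho * rho
    total = 0.0
    for i in range(m):
        for j in range(m):
            if i == j:
                w = (1.0 if i in (0, m - 1) else 1.0 + rho * rho) / d
            elif abs(i - j) == 1:
                w = -rho / d
            else:
                continue                      # all other elements are zero
            total += w * (a1 * a1 * s[i] * s[j]
                          - a1 * s[i] * x[j] - a1 * s[j] * x[i])
    return -0.5 * total

def log_ratio_whitened(x, s, a1, rho):
    """Sum of per-sample log ratios after the triangular transformation."""
    d = 1.0 - rho * rho
    # First sample: y_1 has mean a s_1 sqrt(d) and variance d.
    total = 0.5 * (2.0 * a1 * x[0] * s[0] - a1 * a1 * s[0] * s[0])
    for i in range(1, len(x)):
        ds = s[i] - rho * s[i - 1]
        dx = x[i] - rho * x[i - 1]
        total += 0.5 * ds * (2.0 * a1 * dx - a1 * a1 * ds) / d
    return total

rng = random.Random(1)
s = [math.sin(0.9 * i) for i in range(8)]
x = [rng.gauss(0.0, 1.0) for _ in range(8)]
print(log_ratio_matrix(x, s, 0.5, 0.6), log_ratio_whitened(x, s, 0.5, 0.6))
```

The whitened form updates the statistic with one new term per observation, which suits a sequential test better than recomputing the full quadratic form at every stage.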
The analysis of a sequential detector involves finding the OCF and the ASN. The OCF is obtained from the standard relation $L(a) = (A^h - 1)/(A^h - B^h)$, in which h is the nonzero root of the equation

$$\int_{-\infty}^{\infty} [W_m(\mathbf{x}; a_1 \mathbf{s})/W_m(\mathbf{x}; 0)]^h\, W_m(\mathbf{x}; a \mathbf{s})\, d\mathbf{x} = 1. \tag{67}$$

Substituting from (55), we find that the condition on h is

$$\exp \Big\{ -h a_1 \sum_{i,j} \sigma^{ij} s_i s_j\, (a_1 - 2a - h a_1)/2 \Big\} = 1. \tag{68}$$

This means that h satisfies the simple expression

$$h = 1 - 2a/a_1, \tag{69}$$

which is independent of sample size.

This result shows that for normal noise the Operating Characteristic Function of a sequential test is independent of the correlation matrix $[\sigma_{ij}]$, of the specified signal waveform $s_i$, and of the number of the observations m. Since the correlation matrix is not involved, one concludes that observations can be allowed to be spaced arbitrarily close and hence that, in the limit, the same result carries over to the continuous case.

In Figs. 5, 6, and 7, the OCF's of the coherent detector are plotted. The OCF is important in that it indicates the performance of the detector when the actual signal-to-noise ratio (a) is not as large as the preset signal-to-noise ratio ($a_1$) for which the detector is to have strength (α, β). The OCF is also used in determining the Average Sample Number (ASN). The basic expression is here

$$\bar Z_n = L(a) \log B + [1 - L(a)] \log A. \tag{70}$$

For the coherent detector, from (56), we get

$$\bar Z_n = -\tfrac{1}{2}\, \overline{\sum_{i,j=1}^{n} \sigma^{ij} (a_1^2 s_i s_j - a_1 s_i x_j - a_1 s_j x_i)}. \tag{71}$$

Whether an explicit expression for $\bar n$ can be found will depend on the specific waveform $s_i$, and on the structure of the correlation matrix $[\sigma_{ij}]$.

Consider the case in which the signal value is constant at each observation ($s_i = 1$). We then have

$$\bar Z_n = -\tfrac{1}{2} (a_1^2 - 2a a_1)\, \overline{\Big( \sum_{i,j=1}^{n} \sigma^{ij} \Big)}. \tag{72}$$

A simple expression can be found for $\bar n$, provided the sum of all elements of the inverse correlation matrix is a linear function of n. As an example, consider noise with the exponential correlation function (i.e., RC-filtered noise). The inverse covariance matrix in this case has already been referred to above. One finds that

$$\sum_{i,j=1}^{n} \sigma^{ij} = [(n - 2)(1 - \rho) + 2]/(1 + \rho), \tag{73}$$

from which it follows that for this example

$$\bar n = \frac{1 + \rho}{1 - \rho} \cdot \frac{L \log B + (1 - L) \log A}{(2a a_1 - a_1^2)/2} - \frac{2\rho}{1 - \rho}. \tag{74}$$

When there is no correlation between observations (ρ = 0), (74) reduces to a result given by Wald. The second term in (74) represents the end effects and will not usually be significant. If ρ is set equal to 1 (complete correlation), $\bar n$ becomes infinite. The stronger is the dependence between successive observations, the longer the test will last, which is consistent with the fact that little new information is supplied by each additional observation.

Fig. 6. OCF of the coherent detector; α = β.

Fig. 7. OCF of the coherent detector; α = 10⁻⁵.
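Relations (69), (19), and (74) combine into a short routine for the ASN of the coherent detector with a constant signal. The sketch below is illustrative only: the function name is hypothetical, Wald's approximate thresholds A = (1 − β)/α and B = β/(1 − α) are assumed, and at $a = a_1/2$ (where h = 0 and the ratio in (74) is indeterminate) the limiting value $-(\log A)(\log B)/a_1^2$ is substituted, by analogy with (34).

```python
import math

def coherent_asn(a, a1, alpha, beta, rho=0.0):
    """ASN of the coherent detector for a constant signal, Eq. (74)."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    h = 1.0 - 2.0 * a / a1                          # Eq. (69)
    if abs(h) < 1e-9:
        # a = a1/2: limit of the ratio in (74), cf. Eq. (34).
        ratio = -log_a * log_b / (a1 * a1)
    else:
        num = math.expm1(h * log_a)
        ocf = num / (num - math.expm1(h * log_b))   # Eq. (19)
        mean_total = ocf * log_b + (1.0 - ocf) * log_a
        ratio = mean_total / ((2.0 * a * a1 - a1 * a1) / 2.0)
    return (1.0 + rho) / (1.0 - rho) * ratio - 2.0 * rho / (1.0 - rho)

for ratio_a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(ratio_a, coherent_asn(ratio_a, 1.0, 0.01, 0.01))
```

Evaluating the function over a range of $a/a_1$ should reproduce the peak between a = 0 and $a = a_1$, and raising ρ shows how correlation between samples lengthens the test.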

A series of curves illustrating (74), with ρ = 0, is shown in Fig. 8. Notice the typical peak of the ASN occurring between a = 0 and $a = a_1$. The peak becomes more pronounced as β decreases. It is also interesting to observe that for sequential tests, the behavior near a = 0 depends almost entirely on β, and the behavior near $a = a_1$ almost entirely on α.


[Figure 8 plot: $\bar n(a)$, the average sample number when a is true, vs $a/a_1$; $a_1$ = pre-set signal-to-noise ratio.]

Fig. 8. Average sample numbers ($\times\, a_1^2$) vs $a/a_1$; coherent detection (ρ = 0).

Detection of Gaussian Random Signals in Normal Noise 


Consider the case of Gaussian noise with a zero mean and a mean square value ψ. Let the signal also have Gaussian statistics with a zero mean but with a mean square value $A^2$. Assume that both have the same spectral shape (are passed through the same filter) and that the normalized correlation matrix of observations is given by $[\sigma_{ij}]$. Denote the sequence of observations by $X_1, X_2, \cdots$ so that $\sigma_{ij} = \overline{X_i X_j}/(\psi + A^2)$. The purpose of the test is to distinguish between the null hypothesis $H_0$ that A = 0 (no signal) and the alternative hypothesis $H_1$ that $A = A_1$ (signal $A_1$ present). Statistically this is the test for variance of a Gaussian distribution with a known mean, in which observations are not necessarily independent.

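The first step in construction of such a test is to find the logarithm of the probability ratio. Let us introduce the normalization $a = A/\psi^{1/2}$. The distribution function of a sample of m observations is then

$$W_m(\mathbf{X}; a) = [2\pi \psi (1 + a^2)]^{-m/2} [\det (\sigma_{ij})]^{-1/2} \exp \Big\{ -\sum_{i,j} \frac{\sigma^{ij} X_i X_j}{2\psi (1 + a^2)} \Big\}. \tag{75}$$

The logarithm of the probability ratio, $Z_m = \log\,[W_m(\mathbf{X}; a_1)/W_m(\mathbf{X}; 0)]$, is therefore given by

$$Z_m = -\tfrac{1}{2} m \log (1 + a_1^2) + \tfrac{1}{2} \frac{a_1^2}{1 + a_1^2} \sum_{i,j} \sigma^{ij} X_i X_j/\psi. \tag{76}$$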


The form of the detector is now in principle known. The
actual instrumentation of a detector so defined must, of
course, depend on the matrix $[\sigma^{ij}]$. For the example of
RC-filtered “white” noise, which we have already once


Bussgang and Middleton: Optimum Sequential Detection of Signals in Noise 15 


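used as an illustration, the matrix $[\sigma^{ij}]$ is specified in the
preceding section. Substituting into (76) and introducing
a difference operator $\Delta^2 X_i = X_{i+1} - 2X_i + X_{i-1}$, we find
for this particular example that

$$Z_m = -\tfrac{1}{2}\, m \log (1 + a_1^2)
+ \frac{a_1^2}{2\psi(1 + a_1^2)(1 - \rho^2)}
\Big\{\sum_{i=2}^{m-1}\big[(1 - \rho)^2 X_i^2 - \rho X_i \Delta^2 X_i\big]
+ X_1(X_1 - \rho X_2) + X_m(X_m - \rho X_{m-1})\Big\}. \tag{77}$$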
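The quadratic form behind (77) can be checked numerically. The following is a minimal sketch, assuming that RC-filtered white noise sampled at equal intervals has the normalized correlation $\sigma_{ij} = \rho^{|i-j|}$ (an assumption made here for illustration; the matrix itself is specified in the preceding section). The function names are hypothetical. The sketch compares the second-difference expression with a direct solution of $\sigma y = x$.

```python
def solve(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = s / aug[r][r]
    return y


def quadratic_form_direct(x, rho):
    """sum_ij sigma^{ij} x_i x_j, with sigma_ij = rho**|i-j| (assumed model)."""
    m = len(x)
    sigma = [[rho ** abs(i - j) for j in range(m)] for i in range(m)]
    y = solve(sigma, x)
    return sum(xi * yi for xi, yi in zip(x, y))


def quadratic_form_differences(x, rho):
    """Same quantity written with the second-difference operator (needs m >= 2)."""
    m = len(x)
    total = x[0] * (x[0] - rho * x[1]) + x[-1] * (x[-1] - rho * x[-2])
    for i in range(1, m - 1):
        d2 = x[i + 1] - 2.0 * x[i] + x[i - 1]
        total += (1.0 - rho) ** 2 * x[i] ** 2 - rho * x[i] * d2
    return total / (1.0 - rho ** 2)
```

Under this assumed correlation model the two functions agree to rounding error, which is what makes a recursive (difference-operator) instrumentation of the detector possible.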


If the observations are uncorrelated ($\rho = 0$), we get
simply

$$Z_m = -\tfrac{1}{2}\, m \log (1 + a_1^2) + \frac{1}{2}\,\frac{a_1^2}{1 + a_1^2}\sum_{i=1}^{m} X_i^2/\psi. \tag{78}$$

Once again, the Operating Characteristic Function
$L(a) = (A^h - 1)/(A^h - B^h)$ involves the solution of the
parametric equation

$$\int \left[\frac{W_m(X; a_1)}{W_m(X; 0)}\right]^h W_m(X; a)\, dX = 1. \tag{79}$$

Substituting from (75) and carrying out the integration,
we find that the condition on $h$ is in our case

$$h a_1^2 (1 + a^2)/(1 + a_1^2) = 1 - (1 + a_1^2)^{-h}. \tag{80}$$

This expression is identical with the relation for $h$ which
one would obtain if observations were uncorrelated [9], and
it is likewise independent of sample size.

We conclude that for a sequential probability ratio
detector and Gaussian noise and signal, the probability
of which decision will terminate the detection process is a
function of the actual ($a$) and the preset ($a_1$) signal-to-noise
ratios only. It depends neither on the frequency
spectrum nor on the spacing between observations.

In order to study the ASN, we consider $\bar Z_m$, the expected
value of $Z_m$. Using (76), we get

$$\bar Z_m = -\tfrac{1}{2}\, m \log (1 + a_1^2) + \frac{1}{2}\,\frac{a_1^2}{1 + a_1^2}\sum_{i,j=1}^{m}\sigma^{ij}\,\overline{X_i X_j}/\psi. \tag{81}$$


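We take the statistical average of each term of the sum.
But $\overline{X_i X_j} = \sigma_{ij}\,\psi(1 + a^2)$. We also have the relation

$$\sum_{i,j=1}^{m}\sigma^{ij}\sigma_{ij} = \sum_{i=1}^{m}\delta_{ii} = m, \qquad (\sigma_{ij} = \sigma_{ji}). \tag{82}$$

Hence it follows that

$$\bar Z_m = -\tfrac{1}{2}\, m\,[\log (1 + a_1^2) - a_1^2(1 + a^2)/(1 + a_1^2)]. \tag{83}$$

Using the basic expression (70), we find finally that

$$\bar n = \frac{L \log B + (1 - L)\log A}{[-\log (1 + a_1^2) + a_1^2(1 + a^2)/(1 + a_1^2)]/2}. \tag{84}$$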


The relation (84) does not depend on the matrix $[\sigma^{ij}]$
and is the same as if the observations were uncorrelated
[9]. We have already determined that the same holds true
for (80). We conclude that for a normal random signal
and normal noise with identical frequency spectra, the average
number of observations required depends only on the
signal-to-noise ratios $a$ and $a_1$, and is independent of the
correlation between observations. Thus there is no
dependence either on the actual spectrum or on the
spacing between observations.

This result implies that by selecting points as close
together in time as we wish, we can make the test last,
on the average, an arbitrarily short time. The physical
significance of this unusual result is closely connected
with the assumption that a precise correlation matrix of
observations is known and used in detection. A similar
notion that arbitrarily strong tests can be performed over
arbitrarily short time intervals was discovered by W. C.
Fox [15] in connection with a study of fixed-size sample
plans. We stress that our conclusion applies only under
the conditions of this particular example; the process
must have Gaussian statistics and the matrix $[\sigma^{ij}]$ must
be known exactly and used in the construction of $Z_m$.
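The relations (80) and (84) lend themselves to a short numerical sketch. The code below is illustrative only: it assumes natural logarithms and Wald's approximate boundaries $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$ (an assumption here, not restated in this section), and the function names are hypothetical. It finds the nonzero root $h$ of (80) by bisection and then evaluates $L(a)$ and the ASN.

```python
import math


def solve_h(a, a1):
    """Nonzero root h of (80): h*c = 1 - (1 + a1^2)**(-h), c = a1^2 (1+a^2)/(1+a1^2)."""
    g = 1.0 + a1 * a1
    c = a1 * a1 * (1.0 + a * a) / g

    def f(h):
        return 1.0 - g ** (-h) - h * c

    slope0 = math.log(g) - c          # f'(0); its sign locates the nonzero root
    if abs(slope0) < 1e-9:
        return 0.0                    # degenerate case: mean of z is zero
    sign = 1.0 if slope0 > 0 else -1.0
    lo, hi = sign * 1e-9, sign        # f > 0 just beyond 0 on the root's side
    while f(hi) > 0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ocf_and_asn(a, a1, alpha, beta):
    """Return (L(a), ASN) for the Gaussian-signal test; ASN is None if the drift is zero."""
    log_a = math.log((1.0 - beta) / alpha)     # assumed Wald boundary A
    log_b = math.log(beta / (1.0 - alpha))     # assumed Wald boundary B
    g = 1.0 + a1 * a1
    drift = 0.5 * (-math.log(g) + a1 * a1 * (1.0 + a * a) / g)  # denominator of (84)
    h = solve_h(a, a1)
    if h == 0.0:
        return log_a / (log_a - log_b), None
    big_a, big_b = math.exp(h * log_a), math.exp(h * log_b)
    ocf = (big_a - 1.0) / (big_a - big_b)
    return ocf, (ocf * log_b + (1.0 - ocf) * log_a) / drift
```

As a consistency check, $h = 1$ at $a = 0$ and $h = -1$ at $a = a_1$, so the sketch returns $L(0) = 1 - \alpha$ and $L(a_1) = \beta$ under the assumed boundaries. No correlation matrix enters the computation, in line with the conclusion above.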


Truncated Tests 


There are two reasons why a standard sequential
procedure may be unsatisfactory. These are: 1) even if
signal is present, an individual test may last longer than
can be tolerated, 2) the average length of the test, when
signal appears below the pre-set level, becomes extremely
large, if the probabilities of error $\alpha$ and $\beta$ are chosen to be
very small. In some situations it may become virtually
necessary to interrupt the standard procedure and resolve
between the alternate courses of action, there and then.

If this is done, we speak of a truncated sequential test. 
Tests of this type were originally proposed by Wald 
[9]. The new rules of procedure can be fixed as follows: 
Carry out the regular sequential test until either a decision 
is made or stage N of the test is reached. At stage N, if
no decision has been reached under the sequential rules of
operation, accept the hypothesis $H_0$ if $Z_N < 0$ or accept
the hypothesis $H_1$ if $Z_N > 0$. Under this new rule the test
must terminate in at most N stages. Truncation is then
a compromise between an entirely sequential test and a 
classical, fixed-sample test. It is an attempt to reconcile 
the good features of both of them: the sequential feature 
of examining observations as they accumulate and the 
classical feature of guaranteeing that the tolerances will 
be met with a specified sample size. 

A truncated test can last no longer than $N$ observations.
The Average Sample Number $\bar n_T$ of a truncated test will
differ from the ASN $\bar n$ of an untruncated test, even though
both use the same bounds $A$ and $B$. The difference can be
expressed in terms of the probability density $P(n; a)$ of
the size $n$ of an untruncated test. We get [22]

$$\bar n - \bar n_T = \int_N^\infty (n - N)\,P(n; a)\, dn. \tag{85}$$

Since this integral is always positive it follows that a
truncated test will take, on the average, a smaller sample
than an untruncated test. This does not contradict the
optimum property of sequential detectors, because a
truncated process no longer has probabilities of error
$(\alpha, \beta)$, but has some modified strength $(\alpha', \beta')$.
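The truncation rule can be illustrated with a small simulation. This is a sketch under stated assumptions: independent observations with $\psi = 1$, the single-observation term of (78) as the increment of the log probability ratio, and Wald's approximate boundaries $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$. The function names and the seeding scheme are hypothetical choices for illustration.

```python
import math
import random


def run_test(rng, a, a1, log_a, log_b, n_trunc=None):
    """One sequential test on independent N(0, 1 + a^2) samples; returns (n, decision)."""
    g = 1.0 + a1 * a1
    sigma = math.sqrt(1.0 + a * a)
    z_sum, n = 0.0, 0
    while True:
        x = rng.gauss(0.0, sigma)
        n += 1
        # increment of the log probability ratio for one uncorrelated observation
        z_sum += -0.5 * math.log(g) + 0.5 * (a1 * a1 / g) * x * x
        if z_sum >= log_a:
            return n, 1                      # accept H1
        if z_sum <= log_b:
            return n, 0                      # accept H0
        if n_trunc is not None and n >= n_trunc:
            return n, (1 if z_sum > 0 else 0)  # forced decision at stage N


def average_sample_numbers(a, a1, alpha, beta, n_trunc, trials=500, seed=0):
    """Estimate the ASN without and with truncation on identical sample paths."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    total, total_t = 0, 0
    for t in range(trials):
        n, _ = run_test(random.Random(seed + t), a, a1, log_a, log_b)
        n_t, _ = run_test(random.Random(seed + t), a, a1, log_a, log_b, n_trunc)
        total += n
        total_t += n_t
    return total / trials, total_t / trials
```

Because both runs of a trial see the same samples, the truncated length equals the smaller of the untruncated length and $N$ on every path, so the estimated truncated ASN can never exceed the untruncated one. The price, as noted above, is a change in the error probabilities.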
Wald [9] has established upper bounds on $\alpha'$ and $\beta'$ for
the case of normal distribution of $z$ and independent
observations. His upper bounds are

$$\alpha' \le \alpha + [G(\nu_0^K) - G(\nu_0)] \tag{86}$$

and

$$\beta' \le \beta + [G(\nu_1) - G(\nu_1^K)] \tag{87}$$

where

$$G(\nu) = (2\pi)^{-1/2}\int_{-\infty}^{\nu}\exp(-x^2/2)\, dx, \tag{88}$$

$$\nu_k = -N\bar z(a_k)/[N\sigma^2(a_k)]^{1/2} \tag{89}$$

and

$$\nu_k^K = [\log K - N\bar z(a_k)]/[N\sigma^2(a_k)]^{1/2}, \tag{90}$$

$$K = A \ \text{if}\ k = 0 \quad\text{and}\quad K = B \ \text{if}\ k = 1. \tag{91}$$


Wald’s upper bounds turn out to be appreciably above
the actual probabilities of error. These probabilities,
rather than just upper bounds on them, can be expressed by
introducing $p(n \mid d_k; a_j)$, the conditional distribution of
$n$ under the restriction that the terminal decision is to
accept $H_k$, when $a_j$ is true ($k = 0, 1$; $j = 0, 1$). Notice that

$$P(n; a) = L\,p(n \mid d_0; a) + (1 - L)\,p(n \mid d_1; a), \tag{92}$$


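in which all the three distribution functions refer to an
untruncated test. It can be shown [22] that

$$\alpha' = \alpha\int_0^N p(n \mid d_1; 0)\, dn + \gamma_0\,[G(\nu_0^K) - G(\nu_0)]\int_N^\infty P(n; 0)\, dn \tag{93}$$

and

$$\beta' = \beta\int_0^N p(n \mid d_0; a_1)\, dn + \gamma_1\,[G(\nu_1) - G(\nu_1^K)]\int_N^\infty P(n; a_1)\, dn, \tag{94}$$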
in which 
shes | Wo 8 ay | (95) 


and $W(Z_N; a)$ is the probability density of $Z_N$. A comparison
of Wald’s upper bounds in (86) and (87) with the expressions
in (93) and (94) reveals that the latter differ
by taking into account the probability that the truncation
stage is reached. It can be verified, e.g., that if $\alpha = \beta$
and $N = \bar n$ the relation (86) gives $\alpha' \le (\alpha + \tfrac{1}{2})$, while
(93) gives $\alpha' = \tfrac{1}{2}(\alpha + \tfrac{1}{2})$, so that the upper bound can
be as much as twice the actual value.


Comparison of Sequential with Nonsequential Optimum 
Detectors 


In comparing sequential with fixed-sample size (conventional)
optimum detectors, we must remember that
the sample size of a sequential test is a random variable
whose expected value depends on the true signal-to-noise
ratio. Thus when we compare sample sizes of two tests,
we must specify what signal-to-noise ratio is assumed
actually present.

Consider a sequential and a conventional test of the
same strength $(\alpha, \beta)$, and with the same preset signal-to-noise
ratio $a_1$. The Average Percentage Saving $S$ is defined by

$$S(a) = 100\,[1 - \bar n(a)/n_f]\,\%, \tag{96}$$

in which $n_f$ is the sample size of the conventional test.
For a normal distribution of the logarithm of probability
ratio (for all practical purposes this is the case of small
$\alpha$ and $\beta$) and independent observations, it turns out
[9, 22] that


a.) = 100{1 + 9@, a)/[O"U — 2a) +O" — 28) }% 
(97) 


in which

$$\Theta(u) = (2/\sqrt{\pi})\int_0^u \exp(-y^2)\, dy.$$
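The size of the saving defined in (96) can be estimated with a short sketch. The model used here is an assumption for illustration, not a transcription of (97): the logarithm of the probability ratio of each independent observation is taken to be normal with mean $\pm d^2/2$ and variance $d^2$ (the known-signal case), the fixed-sample size follows from the two normal quantiles, and the sequential ASN at $a = a_1$ uses Wald's approximation. Note that $\Theta^{-1}(1 - 2\alpha)$ equals the standard normal quantile at $1 - \alpha$ divided by $\sqrt 2$. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def percentage_saving(alpha, beta):
    """Approximate saving S(a_1) in per cent for the assumed normal model."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    fixed = (z_alpha + z_beta) ** 2            # n_f * d^2 for the fixed-sample test
    log_a = math.log((1.0 - beta) / alpha)     # assumed Wald boundary A
    log_b = math.log(beta / (1.0 - alpha))     # assumed Wald boundary B
    # ASN at a = a_1 times d^2: [beta log B + (1 - beta) log A] / (1/2)
    sequential = 2.0 * (beta * log_b + (1.0 - beta) * log_a)
    return 100.0 * (1.0 - sequential / fixed)
```

The dependence on $d$ cancels in the ratio, so under these assumptions the saving is a function of the strength $(\alpha, \beta)$ alone.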


Another interesting way to compare two detectors is to
require that the strength $(\alpha, \beta)$ and the average sample
sizes be identical, $n_f = \bar n(a_s)$, and then compare the
minimum detectable signals (say, $a_s$ and $a_f$). The saving
in the minimum detectable signal power is

$$S = 100\,(1 - a_s^2/a_f^2)\,\%. \tag{98}$$

For a coherent detector it turns out that the functions
given by (96) and (97) are identical. Typical curves are
shown in Fig. 9.


Fig. 9—Average percentage saving; coherent detection.
Abscissa: probability of false alarms $\alpha$; ordinate: average percentage saving $S$;
parameter: probability of false dismissals $\beta$. $S = 100\,[1 - (a_s^2/a_f^2)]\,\%$ when $\bar n(a_s) = n_f$,
where $a_s$ is the minimum detectable signal of a sequential test and $a_f$ that of a fixed-length ($n_f$) test.


A comparison of the minimum detectable signals for
the incoherent, sequential and “ideal” [8] modes of
operation ($\alpha = \beta$; weak signals) is made in Fig. 10.
Under the circumstances shown, a sequential observer
detects signals 2-3 db weaker.

For a complete analysis of sequential detectors the
distribution function of $n$ is needed. Some studies of the
sequential distribution functions have been carried out
[9, 22]. One general observation may be made: sequential
detection can be, on the average, carried out faster than
conventional detection; however, this occurs at the
expense of the detection time becoming a random variable.
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Thus, large savings in the sample size are associated with 
a large variance’) and there is an appreciable probability 
that a particular run will exceed the desired average 
length. Typically, with a = 6 = 10°’, the dispersion 
will be about 50 per cent of the average sample size. 
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Fig. 10—Minimum detectable signal; comparison of observers. 
a’? vs n(a). 


LIST OF MAJOR SYMBOLS

$A$                   upper boundary of a sequential test
ASN                   Average Sample Number
$a$                   true value of the signal-to-noise amplitude ratio
$a_0 = 0$             signal-to-noise ratio associated with hypothesis $H_0$
$a_1$                 signal-to-noise ratio associated with hypothesis $H_1$
$B$                   lower boundary of a sequential test
$d_k$                 terminal decision to accept hypothesis $H_k$ ($k = 0$ or 1)
$E[x(\theta)]$        expected value of $x$ when $\theta$ is true
$H_0$                 null hypothesis: $a = 0$
$H_1$                 alternative hypothesis: $a = a_1$
$L(a)$                probability of accepting $H_0$
$m$                   a stage of the sequential detection process
$n$                   the terminal stage of a sequential detection process
$\bar n(\theta)$      alternative notation for $E[n(\theta)]$
$N$                   truncation stage
OCF                   Operating Characteristic Function
$P(n; a)$             distribution function of $n$ when $a$ is true
$p(n \mid d_k; a)$    conditional distribution function of $n$ when $a$ is true under the restriction $d_k$
$t$                   time from the start of a sequential detection process (cf. $m$)
$T$                   time from start to termination of a sequential detection process (cf. $n$)
$\bar T$              Average Sample Length (cf. $\bar n$)
$W(X; \theta)$        $m$-dimensional probability distribution of sample $X$ when $\theta$ is true
$X$                   sample composed of observations $x_1, x_2, \cdots, x_m$
$Z_m$                 logarithm of probability ratio of $m$ observations
$Z_T$                 logarithm of probability ratio over interval $T$
$\bar Z$              average of the logarithm of probability ratio
$z$                   logarithm of probability ratio for a single observation
$\alpha$              probability of a false alarm
$\alpha'$             probability of a false alarm for a truncated test
$\beta$               probability of a false dismissal
$\beta'$              probability of a false dismissal for a truncated test
$\lambda_m$           probability ratio at the $m$th observation

§ See remarks at the end of the first section of Part II, concerning
the necessity of having a large variance if the sequential test is to
have a different performance from a fixed sample test.

BIBLIOGRAPHY

[1] Marcum, J. I., A Statistical Theory of Target Detection by Pulsed Radar, Rand Corp., RM-754, December 1, 1951; ibid., “Mathematical Appendix,” RM-753, July 1, 1948.

[2] Hanse, H., “The Optimization and Analysis of Systems for the Detection of Pulsed Signals in Random Noise,” D.Sc. Thesis, Massachusetts Institute of Technology, January, 1951.

[3] Schwartz, J., “A Statistical Approach to the Automatic Search Problem,” Ph.D. Thesis, Harvard University, June 1951.

[4] Slattery, T. G., “The Detection of Sine Wave in Presence of Noise by the Use of Nonlinear Filter,” Proceedings of the IRE, Vol. 40 (October, 1952), p. 1232.

[5] Middleton, D., Statistical Criteria for the Detection of Pulsed Carriers in Noise, Air Force Cambridge Res. Cent. Rep., November, 1952.

[6] Reich, E. and P. Swerling, “The Detection of a Sinewave in Gaussian Noise,” Journal of Applied Physics, Vol. 24 (March 1953), p. 289.

[7] Middleton, D., “Statistical Criteria for Detection of Pulsed Carriers in Noise: I, II,” Journal of Applied Physics, Vol. 24 (April 1953), pp. 371-378; pp. 379-391.

[8] Siegert, A. J. F., in J. L. Lawson and G. E. Uhlenbeck, Threshold Signals, Radiation Laboratory Series (McGraw-Hill, 1950), Vol. 24, Ch. 7.

[9] Wald, A., Sequential Analysis. New York, John Wiley and Sons, Inc., 1947.

[10] Wald, A. and J. Wolfowitz, “Optimum Character of the Sequential Probability Ratio Test,” Annals of Mathematical Statistics, Vol. 19, p. 326 (1948).

[11] The Statistical Research Group, Columbia University, Sequential Analysis of Statistical Data: Applications, Columbia University Press, New York, 1945.

[12] Stein, C., “A Note on Cumulative Sums,” Annals of Mathematical Statistics, Vol. 17 (1946), pp. 498-499.

[13] Girshick, M. A., “Contributions to the Theory of Sequential Analysis I, II, and III,” Annals of Mathematical Statistics, Vol. 17 (1946).

[14] Barnard, G. A., “Sequential Tests in Industrial Statistics,” Supplement of the Journal of the Royal Statistical Society, Vol. 8, No. 1 (1946).

[15] Fox, W. C., Signal Detectability: A Unified Description of Statistical Methods Employing Fixed and Sequential Observation Processes, Electronic Defense Group, Univ. of Michigan, Tech. Rep. No. 19, December, 1953.

[16] Peterson, W. W., T. G. Birdsall, and W. C. Fox, “The Theory of Signal Detectability,” 1954 Symposium on Information Theory, Transactions of the IRE (PGIT), Vol. 4 (September, 1954), pp. 179-182.

[17] Harrington, J. V., An Analysis of Detection of Repeated Signals in Noise by Binary Integration, MIT Lincoln Lab., Tech. Rep. No. 13, August 14, 1952, and Transactions of the IRE (PGIT), Vol. 1 (March, 1955), pp. 1-9.

[18] Reed, I. S., Analysis of Signal Detection by Sequential Observer, MIT Lincoln Lab., Tech. Rep. No. 20, March 12, 1953.

[19] Reed, I. S. and G. P. Dineen, MIT Lincoln Lab., personal communication.

[20] Blasbalg, H., Theory of Sequential Filtering and Its Application to Signal Detection and Classification, Johns Hopkins University, Radiation Lab., Tech. Rep. No. AF-8, October 18, 1954.

[21] Bussgang, J. J., “Sequential Detection of Signals in Noise,” Ph.D. Thesis, Harvard University, 1955.

[22] Bussgang, J. J. and D. Middleton, “Sequential Detection of Signals in Noise,” Harvard Cruft Lab., Tech. Rep. No. 175, 1955.

[23] Van Meter, D. and D. Middleton, “Modern Statistical Approaches to Reception in Communication Theory,” Transactions of the IRE (PGIT), Vol. 4 (September, 1954), p. 119.

[24] Middleton, D. and D. Van Meter, “Detection and Extraction of Signals in Noise from the Viewpoint of Statistical Decision Theory,” Soc. for Ind. and Appl. Math., Vol. 4, No. 4 (Dec., 1955) and Vol. 5, No. 1 (March, 1956).

[25] Wald, A., Statistical Decision Functions. New York, John Wiley and Sons, Inc., 1950.

[26] Lehmann, E. L., “On Families of Admissible Tests,” Annals of Mathematical Statistics, Vol. 18, 1947.

[27] Stein, C. M., “On Sequences of Experiments,” abstract in Annals of Mathematical Statistics, Vol. 19 (1948).

[28] Blackwell, D. and M. A. Girshick, Theory of Games and Statistical Decisions. New York, John Wiley and Sons, Inc., 1954.

[29] Mood, A., Introduction to the Theory of Statistics. New York, McGraw-Hill, 1950.

[30] Darling, D. A. and A. J. F. Siegert, “The First Passage Problem for a Continuous Markov Process,” Annals of Mathematical Statistics, Vol. 24 (December 1953), pp. 624-639.

[31] Kemperman, F., “The General One-Dimensional Random Walk with Absorbing Barriers with Applications to Sequential Analysis,” University of Amsterdam, Thèse de Doctorat, 1950.

[32] Rice, S. O., Mathematical Analysis of Random Noise. Bell Telephone Systems, Monograph B-1589, 1944-1945.

[33] Woodward, P. M., Probability and Information Theory, with Applications to Radar. New York, McGraw-Hill, 1953.

[34] Middleton, D., The Statistical Theory of Detection I: Optimum Detection of Signals in Noise, MIT Lincoln Lab., Tech. Rep. No. 35, November, 1953.

[35] Stein, S. and J. E. Storer, “Generating a Gaussian Sample,” to be published in the 1955 Wescon Convention Record.


IRE TRANSACTIONS—INFORMATION THEORY 19 


On Binary Channels and Their Cascades 


RICHARD A. SILVERMAN†


Summary—A detailed analysis of the general binary channel is 
given, with special reference to capacity (both separately and in 
cascade), input and output symbol distributions, and probability of 
error. The infinite number of binary channels with the same capacity 
lie on double-branched equicapacity lines. Of the channels on the 
lower branch of a given equicapacity line, the symmetric channel 
has the smallest probability of error and the largest capacity in 
cascade, unless the capacity is small, in which case the asymmetric 
channel (with one noiseless symbol) has the smallest probability of 
error and the largest capacity in cascade. By simply reversing the 
designation of the output (or input) symbols, we can decrease the 
probability of error of any channel on the upper branch of the equi- 
capacity line and increase the capacity in cascade of any asymmetric 
channel on the upper branch. 

In a binary channel neither symbol should be transmitted with a 
probability lying outside the interval [1/e, 1 — (1/e)] if capacity is 
to be achieved. The maximally asymmetric input symbol distributions 
are approached by certain low-capacity channels. For these channels, 
redundancy coding permits an appreciable fraction of capacity in 
cascade if sufficient delay can be tolerated. 


CAPACITY AND SYMBOL DISTRIBUTIONS 


DISCUSSION of the binary channel is usually
confined to the symmetric case, where each of
the transmitted digits is similarly perturbed by
the noise. However, many interesting features of binary
channels are concealed if only symmetric channels are
considered. Accordingly, this paper will be devoted to a
detailed study of the arbitrary binary channel.
Let the channel be characterized by the transition-probability
matrix

$$C = C(\alpha, \beta) = \begin{pmatrix}\alpha & 1 - \alpha\\ \beta & 1 - \beta\end{pmatrix}, \tag{1}$$

where $\alpha$ is the probability that a zero be received as a
zero, $\beta$ the probability that a one be received as a zero, etc.
We shall use the symbol $C$ for both the channel and its
matrix, but no confusion will arise. Computations are
simplified by defining (after Muroga¹) an auxiliary vector

$$X = \begin{pmatrix}X_0\\ X_1\end{pmatrix}$$

which solves the equation

$$CX = -H.$$

Here $H$ is the row-entropy vector of the channel $C$; i.e.,

$$H = \begin{pmatrix}H(\alpha)\\ H(\beta)\end{pmatrix},$$

where $H(x)$ is the entropy function

$$H(x) = -x \log x - (1 - x)\log (1 - x).$$

(All logarithms are to the base 2 unless otherwise indicated;
as another notation for $2^x$, we shall write $\exp_2 x$.) Muroga¹
has shown that the capacity $c(C)$ of the channel $C$ can be
written in terms of the components of the vector $X$ as

$$c(C) = \log\,[\exp_2 X_0 + \exp_2 X_1]. \tag{2}$$

† Formerly Dept. of Electrical Engineering, Massachusetts Institute
of Technology, Cambridge, Mass.; now Institute of Mathematical
Sciences, Div. of Electromagnetic Res., New York Univ.

¹ S. Muroga, “On the capacity of a discrete channel. I,” J. Phys.
Soc. Japan, vol. 8, pp. 484-494; July-August, 1953.

The transmitted and received symbol distributions are
also simply related to $X$. Let $P$ be a vector representing
the transmitted symbol distribution which achieves
capacity, i.e.,

$$P = \begin{pmatrix}P_0\\ 1 - P_0\end{pmatrix},$$

where $P_0$ is the probability with which zeros should be
chosen if capacity (maximum rate) is to be achieved.
Let $P'$ be a vector representing the corresponding received
symbol distribution, i.e.,

$$P' = \begin{pmatrix}P_0'\\ 1 - P_0'\end{pmatrix},$$

where $P_0'$ is the probability that a zero will be received if
the transmitted symbol distribution achieves capacity.
The vectors $P$ and $P'$ are related by the equation

$$P' = \tilde C P,$$

where $\tilde C$ denotes the transpose of the matrix $C$. In terms
of the auxiliary vector $X$, Muroga finds that

$$P = \frac{\exp_2(-c)}{\det (C)}\begin{pmatrix}(1 - \beta)\exp_2 X_0 - \beta\exp_2 X_1\\ -(1 - \alpha)\exp_2 X_0 + \alpha\exp_2 X_1\end{pmatrix} \tag{3}$$

and that

$$P' = \exp_2(-c)\begin{pmatrix}\exp_2 X_0\\ \exp_2 X_1\end{pmatrix}. \tag{4}$$


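Our task is to express the quantities (2), (3), and (4) in
terms of the parameters $\alpha$ and $\beta$ of the binary channel
(1). After some algebraic manipulation we find that

$$c(\alpha, \beta) = \frac{-\beta H(\alpha) + \alpha H(\beta)}{\beta - \alpha}
+ \log\left[1 + \exp_2\left(\frac{H(\alpha) - H(\beta)}{\beta - \alpha}\right)\right], \tag{5}$$

$$P_0(\alpha, \beta) = \beta(\beta - \alpha)^{-1}
- (\beta - \alpha)^{-1}\left[1 + \exp_2\left(\frac{H(\alpha) - H(\beta)}{\alpha - \beta}\right)\right]^{-1}, \tag{6}$$

$$P_0'(\alpha, \beta) = \left[1 + \exp_2\left(\frac{H(\alpha) - H(\beta)}{\alpha - \beta}\right)\right]^{-1}. \tag{7}$$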
Each of the quantities (5), (6), and (7) depends on the
channel parameters $\alpha$ and $\beta$: (5) gives the capacity of $C$,
(6) the probability with which zeros should be chosen at
the transmitter if capacity is to be achieved, and (7) the
probability of a zero appearing at the receiver if zeros are
chosen at the transmitter in accordance with (6).
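The three quantities can be evaluated directly. The following is a minimal sketch of (5), (6), and (7) in Python; the function names are hypothetical, the convention $0 \log 0 = 0$ is used for the entropy function, and the zero-capacity line $\beta = \alpha$, where (6) and (7) are indeterminate, is not handled by the two distribution functions.

```python
import math


def entropy(x):
    """Binary entropy H(x) in bits, with 0 log 0 taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def capacity(alpha, beta):
    """Capacity c(alpha, beta) of the binary channel, following (5)."""
    if alpha == beta:
        return 0.0
    d = beta - alpha
    first = (-beta * entropy(alpha) + alpha * entropy(beta)) / d
    return first + math.log2(1.0 + 2.0 ** ((entropy(alpha) - entropy(beta)) / d))


def p0_received(alpha, beta):
    """Probability of receiving a zero at capacity, following (7); needs alpha != beta."""
    return 1.0 / (1.0 + 2.0 ** ((entropy(alpha) - entropy(beta)) / (alpha - beta)))


def p0_sent(alpha, beta):
    """Probability of sending a zero at capacity, following (6); needs alpha != beta."""
    return (beta - p0_received(alpha, beta)) / (beta - alpha)
```

A useful cross-check is that the mutual information evaluated at the input distribution from `p0_sent`, namely $H(P_0') - P_0 H(\alpha) - (1 - P_0) H(\beta)$, reproduces `capacity`.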

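Each of the functions $c(\alpha, \beta)$, $P_0(\alpha, \beta)$, and $P_0'(\alpha, \beta)$
defines a surface over the unit square $0 \le \alpha, \beta \le 1$. A
study of the expressions (5), (6), and (7) reveals the
following symmetries:

$$c(\alpha, \beta) = c(\beta, \alpha) = c(1 - \alpha, 1 - \beta) = c(1 - \beta, 1 - \alpha), \tag{8}$$

$$P_0(\alpha, \beta) = P_0(1 - \alpha, 1 - \beta) = 1 - P_0(\beta, \alpha) = 1 - P_0(1 - \beta, 1 - \alpha), \tag{9}$$

$$P_0'(\alpha, \beta) = P_0'(\beta, \alpha) = 1 - P_0'(1 - \alpha, 1 - \beta) = 1 - P_0'(1 - \beta, 1 - \alpha). \tag{10}$$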


Fig. 1—Illustrating the symmetries of $c(\alpha, \beta)$, $P_0(\alpha, \beta)$, etc.


These symmetries are illustrated in Fig. 1, where a point
$(\alpha_0, \beta_0)$ and its reflections in the $\beta = \alpha$ and $\beta = 1 - \alpha$
lines are shown. Eq. (8) shows that the capacity has the
same value at any four such symmetrically placed points
(for a reason to be discussed shortly), (9) that $P_0$ has the
same value at points 1 and 3 and one minus that value at
points 2 and 4, and (10) that $P_0'$ has the same value at
points 1 and 4 and one minus that value at points 2 and 3.

Fig. 2 shows lines of constant capacity (equicapacity
lines). Along the line $\beta = \alpha$ the capacity vanishes,
corresponding to the vanishing of $\det (C)$. The line
$\beta = 1 - \alpha$ is the locus of symmetric channels, and along
this line (5) reduces to the familiar expression

$$c(\alpha, 1 - \alpha) = 1 - H(\alpha).$$

Along the lines $\alpha = 0$, $\alpha = 1$, $\beta = 0$, and $\beta = 1$, (5)
reduces to especially simple expressions. For example,

$$c(\alpha, 0) = \log\,[1 + \exp_2(-H(\alpha)/\alpha)], \qquad 0 \le \alpha \le 1.$$

The slope of the curves $c(\alpha, 0)$ and $c(0, \beta)$ at the point
$\alpha = \beta = 0$ is $\log e/e$. It is clear from Fig. 2 that there are
an infinite number of binary channels with the same
capacity. This is to be expected since two parameters are
required to uniquely specify a binary channel.


Fig. 2—Lines of constant $c(\alpha, \beta)$.
The fact that the four channels $C(\alpha, \beta)$, $C(\beta, \alpha)$,
$C(1 - \alpha, 1 - \beta)$, and $C(1 - \beta, 1 - \alpha)$ have the same
capacity, which produces two symmetrically placed
branches of each equicapacity curve, is easily explained.
Clearly it is a matter of indifference which input (or
output) symbol we choose to call a zero and which we
choose to call a one. Reversing the designation of the
input symbols corresponds to premultiplication by the
noiseless matrix

$$J = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$$

and maps the channel $C(\alpha, \beta)$ into the channel $C(\beta, \alpha)$.
Reversing the designation of the output symbols corresponds
to postmultiplication by the matrix $J$ and maps
the channel $C(\alpha, \beta)$ into the channel $C(1 - \alpha, 1 - \beta)$.
Reversing the designation of both the input and output
symbols corresponds to premultiplication and postmultiplication
by the matrix $J$, and maps the matrix
$C(\alpha, \beta)$ into the matrix $C(1 - \beta, 1 - \alpha)$. As (9) and (10)
show, there are properties that, unlike capacity, are not
invariant under all these mappings.
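The three relabelings can be verified by direct matrix multiplication. The sketch below uses hypothetical helper names and plain nested lists; it assumes only the row convention of the channel matrix (rows are inputs, columns are outputs).

```python
def channel(alpha, beta):
    """Transition matrix C(alpha, beta): rows are inputs, columns are outputs."""
    return [[alpha, 1.0 - alpha], [beta, 1.0 - beta]]


def matmul(a, b):
    """Product of two 2x2 matrices; cascading channels multiplies their matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


J = [[0.0, 1.0], [1.0, 0.0]]   # the noiseless symbol-reversing channel


def relabelings(alpha, beta):
    """Return (J C, C J, J C J) for C = C(alpha, beta)."""
    c = channel(alpha, beta)
    return matmul(J, c), matmul(c, J), matmul(matmul(J, c), J)
```

For example, `relabelings(0.8, 0.3)` returns matrices equal to `channel(0.3, 0.8)`, `channel(0.2, 0.7)`, and `channel(0.7, 0.2)` respectively.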

We have just seen that from a given point on an
equicapacity line at least three other points can be
reached by multiplying the channel matrix by another
channel matrix. That there are no more such points is an
immediate consequence of a theorem proved by DeSoer²
to the effect that the capacity of two channels in cascade
is less than the capacity of either unless one is a noiseless
channel (i.e., the unit matrix or one of its permutations)
or unless one is a completely noisy (zero-capacity)
channel.³,⁴ (The reader is reminded that connecting two
(or more) channels in cascade corresponds to multiplying
the corresponding matrices.) Our statement follows from
the fact that there are only two noiseless binary channels,
namely $J$ and the unit matrix.


Fig. 3—Lines of constant $P_0'(\alpha, \beta)$.


Fig. 3 shows lines of constant $P_0'(\alpha, \beta)$, the probability
of receiving a zero if the input symbol distribution achieves
capacity. Along the line $\beta = 1 - \alpha$, the locus of symmetric
channels, $P_0'(\alpha, \beta)$ has the familiar value $\tfrac{1}{2}$. Along the
zero-capacity line $\beta = \alpha$, $P_0'(\alpha, \beta)$ has the limiting value
$\alpha$, although (7) is indeterminate for $\beta = \alpha$. That is,

$$\lim_{\epsilon \to 0} P_0'(\alpha + \epsilon, \alpha) = \lim_{\epsilon \to 0} P_0'(\alpha, \alpha + \epsilon) = \alpha.$$

Fig. 4 shows lines of constant $P_0(\alpha, \beta)$, the probability

with which zeros should be transmitted to achieve


² C. A. DeSoer, “Communication through channels in cascade,”
Sc.D. Thesis, January, 1953, Dept. of Elect. Eng., M.I.T.

³ There is also the intermediate case where the channel matrix is
reducible and one of the submatrices is completely noisy, e.g., the

channel with matrix 


Aut 20 
— 
Oo @ 


This channel (cited by Shannon in reference 4) is effectively the unit
matrix, since the symbols corresponding to the first and second rows
produce indistinguishable effects at the receiver.

⁴ C. E. Shannon and W. Weaver, The Mathematical Theory of
Communication, University of Illinois Press, 1949, pp. 44-45.
capacity. The surface is saddle-shaped with the saddle
point at $\alpha = \beta = \tfrac{1}{2}$. Along the symmetric channel line
$\beta = 1 - \alpha$, $P_0(\alpha, \beta)$ has the familiar value $\tfrac{1}{2}$. Along the
zero-capacity line $\beta = \alpha$, $P_0(\alpha, \beta)$ has the limiting value
$\tfrac{1}{2}$, although (6) is indeterminate for $\beta = \alpha$. That is,

$$\lim_{\epsilon \to 0} P_0(\alpha + \epsilon, \alpha) = \lim_{\epsilon \to 0} P_0(\alpha, \alpha + \epsilon) = \tfrac{1}{2}.$$

The behavior of $P_0(\alpha, \beta)$ at the corners $\alpha = \beta = 0$ and
$\alpha = \beta = 1$ is sufficiently remarkable to warrant special
discussion.

Suppose we approach the point $\alpha = \beta = 0$ along the
line $\alpha = \epsilon$, $\beta = r\epsilon$, where $0 \le r \le \infty$; i.e., along any line
between the positive $\alpha$-axis ($r = 0$) and the positive
$\beta$-axis ($r = \infty$). Then $\lim_{\epsilon \to 0} P_0(\epsilon, r\epsilon)$ takes on all values
between $1/e$ and $1 - (1/e)$, depending on the value of $r$,
provided that for the value $r = 1$ (for which the single
limit is indeterminate) we take the double limit
$\lim_{\epsilon \to 0}\lim_{\epsilon' \to 0} P_0(\epsilon, \epsilon + \epsilon')$. For example,

Fig. 4—Lines of constant $P_0(\alpha, \beta)$.


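$$\lim_{\epsilon \to 0} P_0(\epsilon, 0) = \lim_{\epsilon \to 0}\,[\epsilon(1 + \exp_2(H(\epsilon)/\epsilon))]^{-1}
= \lim_{\epsilon \to 0}\left[\epsilon + e\left(1 - \frac{\epsilon}{2}\right)\right]^{-1} = 1/e;$$

$$\lim_{\epsilon \to 0} P_0(0, \epsilon) = 1 - \lim_{\epsilon \to 0} P_0(\epsilon, 0) = 1 - (1/e);$$

$$\lim_{\epsilon \to 0} P_0(\epsilon, \tfrac{1}{2}\epsilon) = \lim_{\epsilon \to 0}\left\{-1 + [(e/4) + (\epsilon/2) - (3\epsilon e/16)]^{-1}\right\} = (4/e) - 1;$$

$$\lim_{\epsilon \to 0}\lim_{\epsilon' \to 0} P_0(\epsilon, \epsilon + \epsilon')
= \lim_{\epsilon \to 0}\lim_{\epsilon' \to 0}\left\{\frac{\epsilon + \epsilon'}{\epsilon'}
- \frac{1}{\epsilon'}\left[1 + \exp_2\left(H'(\epsilon) + \tfrac{1}{2}\epsilon' H''(\epsilon)\right)\right]^{-1}\right\} = \tfrac{1}{2}.$$

In evaluating the limits we have made free use of the
expressions

$$H(x) = x \log (e/x) - (x^2/2)\log e + O(x^3),$$

$$H'(x) = (d/dx)H(x) = \log\,[(1 - x)/x],$$

$$-H''(x) = -(d^2/dx^2)H(x) = x^{-1}(1 - x)^{-1}\log e.$$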


We see that the point $\alpha = \beta = 0$ (and its image $\alpha = \beta = 1$
in the line $\beta = 1 - \alpha$) is a point of discontinuity of the
function $P_0(\alpha, \beta)$, whose limiting behavior there depends
on the direction of approach in a sort of “spiral staircase”
fashion.
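The direction-dependent limits can be observed numerically by evaluating (6) close to the corner. This is a rough sketch with hypothetical function names; it uses a small but finite $\epsilon$, so the values are approximations to the limits, and the direction $r = 1$ is excluded because (6) is indeterminate on the line $\beta = \alpha$.

```python
import math


def entropy(x):
    """Binary entropy H(x) in bits, with 0 log 0 taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def p0_sent(alpha, beta):
    """Probability of sending a zero at capacity, following (6); needs alpha != beta."""
    received = 1.0 / (1.0 + 2.0 ** ((entropy(alpha) - entropy(beta)) / (alpha - beta)))
    return (beta - received) / (beta - alpha)


def corner_value(r, eps=1e-6):
    """P0 evaluated at (eps, r*eps): an approximation to the limit along direction r != 1."""
    return p0_sent(eps, r * eps)


def beta_axis_value(eps=1e-6):
    """P0 evaluated at (0, eps): approach along the positive beta-axis (r = infinity)."""
    return p0_sent(0.0, eps)
```

With the default $\epsilon$, `corner_value(0.0)` is close to $1/e \approx 0.368$, `corner_value(0.5)` is close to $(4/e) - 1 \approx 0.472$, and `beta_axis_value()` is close to $1 - (1/e) \approx 0.632$.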

The maximum value of $P_0(\alpha, \beta)$ is the limiting value
$1 - (1/e)$ obtained when we approach the point $\alpha = \beta = 0$
along the positive $\beta$-axis, and the minimum value of
$P_0(\alpha, \beta)$ is the limiting value $1/e$ obtained when we
approach the point $\alpha = \beta = 0$ along the positive $\alpha$-axis.
(There are two corresponding limits at the point
$\alpha = \beta = 1$.) Of course, the channel capacity is zero in
both limits, so that there is no channel with positive
capacity whose input symbol distribution is as asymmetric
as $P = [1/e, 1 - (1/e)]$ or $P = [1 - (1/e), 1/e]$. However,
there are low-capacity channels whose input symbol
distributions are arbitrarily close to these maximally
asymmetric ones. These low-capacity channels will be
discussed further below.

We see that in a binary channel neither transmitted 
symbol can be selected with a probability lying outside 
the interval [1/e, 1 — (1/e)] if capacity is to be achieved. 
If we are compelled to send digits from a more asymmetric 
distribution (as we may well be), the possibility of signal- 
ing at capacity is precluded from the start. Intuitively, 
this means that in a binary channel no decrease in equivo- 
cation obtained by skewing the input symbol distribution 
can justify making the source entropy less than H(1/e), 
at least if obtaining maximum rate (capacity) is the 
objective. 

For channels with larger alphabets it may be quite
proper to choose one or more transmitted symbols with
probability less than $1/e$, or indeed to suppress one or
more transmitted symbols. Thus, for example, in the
ternary channel

$$\begin{pmatrix}\alpha & 1 - \alpha & 0\\ \tfrac{1}{2} & 0 & \tfrac{1}{2}\\ 0 & 1 - \alpha & \alpha\end{pmatrix}$$

capacity cannot be achieved if the symbol corresponding
to the second row is transmitted, unless $\alpha < \alpha_0$, where
$\alpha_0 \approx 0.64$ is the solution of the equation

$$\log \alpha = -\alpha.$$

Muroga gives many other examples of the need for
suppressing possible transmitted symbols in his basic
paper. He was the first to point out the need of taking
special care that $P_i$ does not become negative in capacity
calculations.
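The threshold $\alpha_0$ can be reproduced with a short sketch of Muroga's procedure: solve $CX = -H$, form the output distribution from $X$, and solve the transposed system for the input distribution. The ternary matrix used below, with second row $(\tfrac{1}{2}, 0, \tfrac{1}{2})$, is an assumption chosen because it yields exactly the stated condition $\log \alpha = -\alpha$; all function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = s / aug[r][r]
    return y


def muroga_input_distribution(matrix):
    """Formal input distribution P from C X = -H, P' = 2**(X - c), C~ P = P'."""
    row_entropy = [-sum(p * math.log2(p) for p in row if p > 0.0) for row in matrix]
    x = solve(matrix, [-h for h in row_entropy])
    cap = math.log2(sum(2.0 ** xi for xi in x))
    received = [2.0 ** (xi - cap) for xi in x]
    transpose = [list(col) for col in zip(*matrix)]
    return solve(transpose, received)     # a negative entry means: suppress that symbol


def ternary(alpha):
    """Assumed ternary channel of the example (second row is 1/2, 0, 1/2)."""
    return [[alpha, 1.0 - alpha, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0 - alpha, alpha]]


def alpha_threshold():
    """Root of log2(alpha) = -alpha on (0.5, 1), found by bisection."""
    lo, hi = 0.5, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.log2(mid) + mid < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under this assumed matrix the formal probability of the second symbol is positive for $\alpha$ below the threshold, vanishes at it, and is negative above it, which is the signal that the symbol must be suppressed and the capacity recomputed for the reduced alphabet.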
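The four channels which in the zero-capacity limit
achieve one of the maximally asymmetric input distributions
$P = [1/e, 1 - (1/e)]$ or $P = [1 - (1/e), 1/e]$ have
matrices

$$\begin{pmatrix}\epsilon & 1 - \epsilon\\ 0 & 1\end{pmatrix},\quad
\begin{pmatrix}1 - \epsilon & \epsilon\\ 1 & 0\end{pmatrix},\quad
\begin{pmatrix}0 & 1\\ \epsilon & 1 - \epsilon\end{pmatrix},\quad
\begin{pmatrix}1 & 0\\ 1 - \epsilon & \epsilon\end{pmatrix}. \tag{11}$$

We shall refer to these channels as “$\epsilon$-channels”. To the
first order in $\epsilon$, they all have capacity

$$c = \frac{\log e}{e}\,\epsilon \approx 0.53\,\epsilon \ \text{bits}. \tag{12}$$

Introducing the abbreviation $k = (2 - e)/2 \approx -0.36$,
we find that the input symbol distribution which achieves
capacity for the first and second of these channels is

$$P = \begin{pmatrix}(e + k\epsilon)^{-1}\\ 1 - (e + k\epsilon)^{-1}\end{pmatrix}, \tag{13}$$

whereas for the third and fourth it is

$$P = \begin{pmatrix}1 - (e + k\epsilon)^{-1}\\ (e + k\epsilon)^{-1}\end{pmatrix}. \tag{14}$$

The corresponding output symbol distributions are

$$P' = \begin{pmatrix}\epsilon(e + k\epsilon)^{-1}\\ 1 - \epsilon(e + k\epsilon)^{-1}\end{pmatrix} \tag{15}$$

for the first and third channels, and

$$P' = \begin{pmatrix}1 - \epsilon(e + k\epsilon)^{-1}\\ \epsilon(e + k\epsilon)^{-1}\end{pmatrix} \tag{16}$$

for the second and fourth channels. Eqs. (13) through (16)
are accurate only to the first order in $\epsilon$.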
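The first-order capacity of the four channels can be checked against the exact expression (5). The sketch below uses hypothetical names and a finite $\epsilon$, so the agreement with the slope $\log e/e$ is approximate.

```python
import math


def entropy(x):
    """Binary entropy H(x) in bits, with 0 log 0 taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def capacity(alpha, beta):
    """Capacity c(alpha, beta) of the binary channel, following (5)."""
    if alpha == beta:
        return 0.0
    d = beta - alpha
    first = (-beta * entropy(alpha) + alpha * entropy(beta)) / d
    return first + math.log2(1.0 + 2.0 ** ((entropy(alpha) - entropy(beta)) / d))


def epsilon_channel_capacities(eps):
    """Exact capacities of the four channels C(eps,0), C(1-eps,1), C(0,eps), C(1,1-eps)."""
    pairs = [(eps, 0.0), (1.0 - eps, 1.0), (0.0, eps), (1.0, 1.0 - eps)]
    return [capacity(a, b) for a, b in pairs]


FIRST_ORDER_SLOPE = math.log2(math.e) / math.e   # (log e)/e in bits per unit eps
```

For $\epsilon = 10^{-3}$ the four capacities coincide to rounding error and their ratio to $\epsilon$ is within a fraction of a per cent of `FIRST_ORDER_SLOPE`, which is about 0.53.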


PROBABILITY OF ERROR 


We have seen that there are infinitely many binary
channels with the same capacity. It is natural to ask
whether there are contexts in which any of these channels
with the same capacity is to be preferred to the others.
Two questions that we might ask are: 1) Which of the
channels with the same capacity has the smallest probability
of error (in a received digit), and 2) Which has the
largest end-to-end capacity if its output terminals are
connected to the input terminals of an identical channel?⁵
We answer the first question in this section and defer
discussion of the second question until the next section.

The probability of error (at capacity) is given by the
expression

$$P_e(\alpha, \beta) = \beta + (1 - \alpha - \beta)\,P_0(\alpha, \beta), \tag{17}$$

⁵ This question arises naturally if we consider building up a cascade
of repeaters, using a given binary channel as a unit.
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where P,(a, 8) is the probability of a transmitted zero as 
given by (6). It is easily verified that P,(a, 8) has the 
following symmetries 


eee 0, a) Ee, a) 


ibd (A Aa cyes | 5,8) (18) 


In deriving (18) free use has been made of the symmetries 
of P,(a, 8) as given by (9). Referring to Fig. 1, we see 
that P.(a, 8) has the same value at the points 1 and 2 
and one minus that value at the points 3 and 4. (Note 
that none of the functions c(a, B), Po(a, 8), Pé(a, 8B), 
and P,(a, 8) has the same symmetries. ) 
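
The symmetries of the probability of error can be confirmed
numerically. The sketch below is a minimal check, assuming the
standard stationarity condition for the capacity-achieving input of a
binary channel with rows $(\alpha, 1-\alpha)$ and $(\beta, 1-\beta)$;
eq. (6) of the paper is not reproduced in this section, so
`optimal_p0` is an assumed stand-in for it, and all names are
illustrative.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def optimal_p0(a, b):
    """Capacity-achieving P(0) for the channel [[a, 1-a], [b, 1-b]], a != b."""
    t = (h2(a) - h2(b)) / (a - b)   # slope of the entropy chord between the two rows
    q = 1 / (1 + 2 ** t)            # optimal probability of receiving a zero
    return (q - b) / (a - b)        # invert q = p0*a + (1 - p0)*b

def p_error(a, b):
    """Eq. (17): probability of error at capacity."""
    return b + (1 - a - b) * optimal_p0(a, b)

a, b = 0.8, 0.3
pe = p_error(a, b)
print("Pe(a, b)        =", pe)
print("Pe(1-b, 1-a)    =", p_error(1 - b, 1 - a))      # same value
print("Pe(b, a)        =", p_error(b, a))              # one minus that value
print("Pe(1-a, 1-b)    =", p_error(1 - a, 1 - b))      # one minus that value
```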

Fig. 5 shows lines of constant $P_e(\alpha, \beta)$. Along the
symmetric channel line $\beta = 1 - \alpha$, $P_e(\alpha, 1 - \alpha)$ has the
familiar value $1 - \alpha$. Along the zero-capacity line $\beta = \alpha$,
$P_e(\alpha, \alpha)$ has the limiting value 1/2. At the point $\alpha = 0$,
$\beta = 1$, corresponding to the channel matrix $J$, $P_e = 1$,
whereas at the point $\alpha = 1$, $\beta = 0$, corresponding to the
unit matrix, $P_e = 0$.


Fig. 5. Lines of constant $P_e(\alpha, \beta)$.


We have already noted that $P_0(\alpha, \beta)$ has discontinuities
at the corners $\alpha = \beta = 0$ and $\alpha = \beta = 1$, and indeed that
any value between $1/e$ and $1 - (1/e)$ can be obtained by
approaching these discontinuities along the proper directions.
Eq. (17) shows that $P_e(\alpha, \beta)$ shares these discontinuities.
For since

$$
\lim_{\alpha,\beta \to 0} P_e(\alpha, \beta) = \lim_{\alpha,\beta \to 0} P_0(\alpha, \beta),
$$

the limiting behavior of $P_e(\alpha, \beta)$ at the two discontinuities
is identical with that of $P_0(\alpha, \beta)$, however different the
over-all appearance of the two sets of curves. This fact is
apparent from Fig. 5. In particular, it follows that the
curve $P_e(\alpha, \beta) = 1/e$ (not shown) must come into the
points $\alpha = \beta = 0$ and $\alpha = \beta = 1$ with zero slope.
As we have seen, the portion of the equicapacity curve
lying in the triangular region $\beta < \alpha$, $\beta < 1 - \alpha$ generates
the rest of the equicapacity curve under the mappings
corresponding to reversing the designation of the input
symbols or output symbols or both. Eq. (18) shows that
the probability of error for channels on the upper branch
of the equicapacity curve (above the line $\beta = \alpha$) is greater
than for channels on the lower branch (below the line
$\beta = \alpha$), and indeed is greater than 1/2. However, a value
of $P_e$ greater than 1/2 is artificial, for if communication is
through such a channel, the receiver can obtain information
at the same rate and with probability of error
one minus that value merely by reversing the designation
of the received symbols. (The transmitter need not be
informed of this reversal, for (9) shows that the input
symbol distribution remains the same in the reversed
channel.) Thus our problem reduces to finding which of
the channels on the portion of the equicapacity curve
lying in the triangular region $\beta < \alpha$, $\beta < 1 - \alpha$ has the
smallest probability of error.

The question is immediately answered if we superimpose
the curves of Figs. 2 and 5. We find that a symmetric
channel with a given capacity has a smaller probability
of error than an asymmetric channel with the same
capacity, unless the channels have very low capacity. In the
latter case it is easily verified that the symmetric channel

$$
\begin{bmatrix}
1/2 + (\epsilon/2e)^{1/2} & 1/2 - (\epsilon/2e)^{1/2} \\
1/2 - (\epsilon/2e)^{1/2} & 1/2 + (\epsilon/2e)^{1/2}
\end{bmatrix}
$$

has the same capacity as the asymmetric $\epsilon$-channel

$$
\begin{bmatrix} \epsilon & 1-\epsilon \\ 0 & 1 \end{bmatrix},
$$

namely $\frac{\log e}{e}\,\epsilon$ bits.

Using (13) and (17), we find that the probability of error
for the asymmetric channel is

$$
(P_e)_{\mathrm{asym}} = (1 - \epsilon)(e + k\epsilon)^{-1},
$$

whereas that of the symmetric channel is obviously

$$
(P_e)_{\mathrm{sym}} = 1/2 - (\epsilon/2e)^{1/2}.
$$

For small $\epsilon$, it is apparent that

$$
(P_e)_{\mathrm{asym}} < (P_e)_{\mathrm{sym}}
$$

as asserted. Indeed

$$
\lim_{\epsilon \to 0} (P_e)_{\mathrm{asym}} = 1/e, \tag{19}
$$

whereas

$$
\lim_{\epsilon \to 0} (P_e)_{\mathrm{sym}} = 1/2. \tag{20}
$$
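
The low-capacity comparison can be illustrated numerically. The
sketch below evaluates the probability of error of the asymmetric
channel from its exact capacity-achieving input (an assumed closed
form for the one-sided channel, not the paper's first-order
expression (13)) and that of the symmetric channel of the same
first-order capacity. The function names are illustrative.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def pe_asymmetric(eps):
    """Error probability at capacity for the channel [[eps, 1-eps], [0, 1]]."""
    s = h2(eps) / eps
    p0 = 1 / (eps * (1 + 2 ** s))     # exact capacity-achieving P(0)
    return (1 - eps) * p0             # eq. (17) with alpha = eps, beta = 0

def pe_symmetric(eps):
    """Error probability of the symmetric channel with diagonal 1/2 + sqrt(eps/2e)."""
    return 0.5 - math.sqrt(eps / (2 * math.e))

for eps in (1e-2, 1e-4, 1e-6):
    print(f"eps={eps:g}  asym={pe_asymmetric(eps):.4f}  sym={pe_symmetric(eps):.4f}")
```

For each small $\epsilon$ tried, the asymmetric value stays near $1/e$
while the symmetric value approaches 1/2 from below.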
CASCADED CHANNELS


We turn to a discussion of the second of the questions 
raised at the beginning of the preceding section: Which 
of the channels with the same capacity has the largest 
end-to-end capacity if its output terminals are connected 
to the input terminals of an identical channel? 

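First we remind the reader (see the first section) that

$$
c(C^{n+1}) < c(C^{n}),
$$

i.e., capacity is a decreasing function of the number of
cascaded stages, unless $C$ is one of the noiseless channels

$$
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
$$

or one of the zero-capacity channels

$$
\begin{bmatrix} \alpha & 1-\alpha \\ \alpha & 1-\alpha \end{bmatrix}. \tag{21}
$$

The reader should further note that, unless $C$ is one of
the two noiseless channels,

$$
\lim_{n \to \infty} c(C^{n}) = 0.
$$

For then $C$ is irreducible, and by a well-known theorem
on Markov chains⁶ $\lim_{n \to \infty} C^{n}$ has the form (21), and
consequently

$$
\lim_{n \to \infty} c(C^{n}) = c\left(\lim_{n \to \infty} C^{n}\right) = 0.
$$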

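We begin by squaring the matrix $C(\alpha, \beta)$, obtaining

$$
C^{2}(\alpha, \beta) =
\begin{bmatrix}
A(\alpha, \beta) & 1 - A(\alpha, \beta) \\
B(\alpha, \beta) & 1 - B(\alpha, \beta)
\end{bmatrix},
$$

where

$$
\begin{aligned}
A(\alpha, \beta) &= \alpha^{2} + (1 - \alpha)\beta, \\
B(\alpha, \beta) &= \alpha\beta + (1 - \beta)\beta.
\end{aligned}
$$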


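Thus to every channel $(\alpha, \beta)$ on a given equicapacity
curve corresponds a squared channel (cascaded with
itself) with matrix elements $A(\alpha, \beta)$ and $B(\alpha, \beta)$. The
functions $A(\alpha, \beta)$ and $B(\alpha, \beta)$ exhibit the following symmetries:

$$
\begin{aligned}
A(\alpha, \beta) &= 1 - B(1 - \beta,\ 1 - \alpha), \\
B(\alpha, \beta) &= 1 - A(1 - \beta,\ 1 - \alpha),
\end{aligned}
\tag{22}
$$

and

$$
\begin{aligned}
A(\beta, \alpha) &= 1 - B(1 - \alpha,\ 1 - \beta), \\
B(\beta, \alpha) &= 1 - A(1 - \alpha,\ 1 - \beta).
\end{aligned}
\tag{23}
$$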
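
A short numerical check of the squared-channel elements and of the
symmetries (22) and (23) is sketched below. It multiplies the 2-by-2
matrix explicitly rather than using the closed forms, so that the two
can be compared; the function name `square_channel` and the sample
point are illustrative choices, not taken from the paper.

```python
def square_channel(a, b):
    """Return (A, B), the first-column entries of C(a, b) squared."""
    c = [[a, 1 - a], [b, 1 - b]]
    sq = [[sum(c[i][k] * c[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return sq[0][0], sq[1][0]

a, b = 0.7, 0.2
A, B = square_channel(a, b)
A_ref, B_ref = square_channel(1 - b, 1 - a)    # channel (1 - beta, 1 - alpha)
A_swap, B_swap = square_channel(b, a)          # channel (beta, alpha)
A_rev, B_rev = square_channel(1 - a, 1 - b)    # channel (1 - alpha, 1 - beta)

print("A, B             =", A, B)
print("closed forms     =", a * a + (1 - a) * b, a * b + (1 - b) * b)
print("(22): 1 - B_ref  =", 1 - B_ref, "  1 - A_ref =", 1 - A_ref)
print("(23): A_swap     =", A_swap, "  1 - B_rev =", 1 - B_rev)
print("A - B            =", A - B, "  (a - b)^2 =", (a - b) ** 2)
```

The last line reflects the identity $A - B = (\alpha - \beta)^2$, which
follows directly from the closed forms for $A$ and $B$.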

However, there is no simple relation between $A(\alpha, \beta)$ and
$A(\beta, \alpha)$, or between $B(\alpha, \beta)$ and $B(\beta, \alpha)$
[unless $\beta = 1 - \alpha$],
so that two quite different curves are produced by squaring
the matrices corresponding to a given equicapacity line,
one originating from the upper branch of the equicapacity
curve (above the line $\beta = \alpha$), and the other originating
from the lower branch (below the line $\beta = \alpha$). The portion
of the $(A, B)$ curve which originates from the upper branch
of the equicapacity curve has no intercepts on the $\alpha$- and
$\beta$-axes; we shall call it the upper branch of the $(A, B)$ curve.
The portion of the $(A, B)$ curve which originates from the
lower branch of the equicapacity curve has intercepts on
the $\alpha$- and $\beta$-axes; we shall call it the lower branch of the
$(A, B)$ curve. Eqs. (22) and (23) show that both branches
of the $(A, B)$ curve are symmetric in the line $\beta = 1 - \alpha$.
Moreover, since

$$
\begin{aligned}
A(\alpha,\ 1 - \alpha) &= A(1 - \alpha,\ \alpha), \\
B(\alpha,\ 1 - \alpha) &= B(1 - \alpha,\ \alpha),
\end{aligned}
$$

the two branches contact on the line $\beta = 1 - \alpha$.

6 W. Feller, "Probability Theory and Its Applications," vol. 1,
John Wiley and Sons, New York, 1950. Reference is made to
Theorem 2, p. 325.

These facts are illustrated by Fig. 6, which shows three
$(A, B)$ curves, those generated by squaring the channels
with capacities 0.1, 0.4, and 0.7. The $(\alpha, \beta)$ values
corresponding to these channels were read off the corresponding
equicapacity lines of Fig. 2. Note that all the $(A, B)$ curves
lie below the line $\beta = \alpha$. This is because $\beta > \alpha$ implies
$B(\alpha, \beta) < A(\alpha, \beta)$, whereas $\beta < \alpha$ implies
$B(\beta, \alpha) < A(\beta, \alpha)$.


Fig. 6. $(A, B)$ curves corresponding to the channels of capacities
0.1, 0.4, and 0.7.


Of all the channels with the same capacity, the two
corresponding to the end-points of the upper branch of
the $(A, B)$ curve have the smallest capacity under cascade.
Moreover, any channel on the upper branch of the $(A, B)$
curve has lower capacity than its images (under multiplication
by $J$) on the lower branch. However, we can avoid
the low capacity in cascade exhibited by these channels
by an extremely simple intermediate station behavior,
namely by crossing the connections between the outputs
of one channel and the inputs of the next. In this way we
arrive on the lower branch of the $(A, B)$ curve, which has
higher capacity in cascade. (It will be recalled that in the
preceding section we avoided probabilities of error greater
than 1/2 by exactly the same expedient.) We shall comment
on the significance of this intermediate station behavior
below.

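The problem thus reduces to finding the channel on the
lower branch of the $(A, B)$ curve with the highest capacity.
As in the preceding section, we resort to a superposition
of curves, this time superimposing the curves of Figs. 2
and 6. We find that a symmetric channel with a given
capacity has a higher capacity under cascade than an
asymmetric channel with the same capacity, unless the
channels have very low capacity. As an example of this
exceptional behavior at low capacity, we cite again the
two channels

$$
C_{\mathrm{asym}} = \begin{bmatrix} \epsilon & 1-\epsilon \\ 0 & 1 \end{bmatrix},
\qquad
C_{\mathrm{sym}} = \begin{bmatrix}
1/2 + (\epsilon/2e)^{1/2} & 1/2 - (\epsilon/2e)^{1/2} \\
1/2 - (\epsilon/2e)^{1/2} & 1/2 + (\epsilon/2e)^{1/2}
\end{bmatrix},
\tag{24}
$$

which both have capacity of $(\epsilon \log e)/e$ bits. The squares
of the matrices (24) are

$$
\begin{bmatrix} \epsilon^{2} & 1-\epsilon^{2} \\ 0 & 1 \end{bmatrix}
\qquad \text{and} \qquad
\begin{bmatrix}
1/2 + \epsilon/e & 1/2 - \epsilon/e \\
1/2 - \epsilon/e & 1/2 + \epsilon/e
\end{bmatrix}
$$

respectively. The corresponding capacities are

$$
c[C_{\mathrm{asym}}^{2}] = \frac{\log e}{e}\,\epsilon^{2} \ \text{bits}, \tag{25}
$$

and

$$
c[C_{\mathrm{sym}}^{2}] = 2\,\frac{\log e}{e^{2}}\,\epsilon^{2} \ \text{bits}. \tag{26}
$$

Thus

$$
c[C_{\mathrm{asym}}^{2}] > c[C_{\mathrm{sym}}^{2}]
$$

as asserted.⁷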
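
The capacities of the two channels and of their squares can be
evaluated directly. The sketch below assumes the standard closed
forms for the capacity of a one-sided channel and of a symmetric
binary channel (neither formula is quoted from the paper), and uses
the fact that squaring sends $\epsilon$ to $\epsilon^2$ in the
asymmetric case and the offset $(\epsilon/2e)^{1/2}$ to $\epsilon/e$
in the symmetric case. Names are illustrative.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def z_capacity(a):
    """Capacity (bits) of the one-sided channel [[a, 1-a], [0, 1]]."""
    return math.log2(1 + 2 ** (-h2(a) / a))

def bsc_capacity(delta):
    """Capacity (bits) of the symmetric channel with diagonal entries 1/2 + delta."""
    return 1 - h2(0.5 + delta)

eps = 1e-3
delta = math.sqrt(eps / (2 * math.e))
LOG2_E = math.log2(math.e)

single = (z_capacity(eps), bsc_capacity(delta))
# Squaring maps eps -> eps**2 and delta -> 2*delta**2 = eps/e.
squared = (z_capacity(eps ** 2), bsc_capacity(2 * delta ** 2))

print("single stage :", single)
print("two stages   :", squared)
print("first-order  :", LOG2_E / math.e * eps ** 2, 2 * LOG2_E / math.e ** 2 * eps ** 2)
```

With equal single-stage capacities, the asymmetric channel retains the
larger capacity after one cascade, by a factor close to $e/2$.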
We can now answer both questions posed at the
beginning of the preceding section as follows: Of all the
binary channels on the lower branch of a given equicapacity
line, the symmetric channel has the smallest
probability of error and the largest capacity in cascade,
unless the capacity is small, in which case the asymmetric
channel (with one noiseless symbol, i.e., $\beta = 0$) has the
smallest probability of error and the largest capacity in
cascade. Continuity requires that there be a small range
of values of the capacity for which asymmetric channels
with $\beta \neq 0$ are superior in these two respects, but we
have made no detailed study of this intermediate case.
There is no point in trying to establish a preference among
the channels on the upper branch of the equicapacity line,
since by reversing the designation of the output (or input)
symbols we arrive at a channel with a smaller probability
of error and a larger capacity in cascade. (An exception
to this statement in the symmetric case is noted below.)

7 Eqs. (19), (20), (25), and (26) suggest the conjecture that the
capacity in cascade of channels with the same low capacity is
inversely proportional to their probabilities of error (see also (32) in
the Appendix).

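A particularly striking example of the difference between
the capacity in cascade of a channel on the upper branch
of the equicapacity line and its images on the lower branch
is afforded by comparing the two $\epsilon$-channels

$$
C_\epsilon = \begin{bmatrix} \epsilon & 1-\epsilon \\ 0 & 1 \end{bmatrix},
\qquad
C_\epsilon' = \begin{bmatrix} 1-\epsilon & \epsilon \\ 1 & 0 \end{bmatrix}.
$$

$C_\epsilon$ is on the lower branch of the equicapacity line (with
capacity $(\epsilon \log e)/e$ bits), and $C_\epsilon'$, obtained by reversing
the designation of the output terminals of $C_\epsilon$, is on the
upper branch. The squares of $C_\epsilon$ and $C_\epsilon'$ are

$$
C_\epsilon^{2} = \begin{bmatrix} \epsilon^{2} & 1-\epsilon^{2} \\ 0 & 1 \end{bmatrix}
$$

and

$$
C_\epsilon'^{\,2} = \begin{bmatrix}
1 - \epsilon + \epsilon^{2} & \epsilon - \epsilon^{2} \\
1 - \epsilon & \epsilon
\end{bmatrix}.
$$

The corresponding capacities are

$$
c[C_\epsilon^{2}] = \frac{\log e}{e}\,\epsilon^{2} \ \text{bits}
$$

and

$$
c[C_\epsilon'^{\,2}] = \frac{\log e}{8}\,\epsilon^{3} \ \text{bits},
$$

so that, in this extreme case, the capacity of one channel
is an order of magnitude less than that of the other.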
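
The contrast between the two squared channels can be examined
numerically. The following sketch computes the capacity of a general
binary channel from the stationarity condition on the input
distribution (an assumed standard formula, not quoted from the paper)
and applies it to both squares. The comparison values
$(\log e/e)\,\epsilon^2$ and $(\log e/8)\,\epsilon^3$ used in the
printout are small-$\epsilon$ approximations adopted for this sketch.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def capacity(a, b):
    """Capacity in bits of the binary channel [[a, 1-a], [b, 1-b]]."""
    if a == b:
        return 0.0
    t = (h2(a) - h2(b)) / (a - b)
    q = 1 / (1 + 2 ** t)            # optimal probability of receiving a zero
    p0 = (q - b) / (a - b)          # optimal probability of sending a zero
    return h2(q) - p0 * h2(a) - (1 - p0) * h2(b)

def square(a, b):
    """Matrix elements (A, B) of the channel cascaded with itself."""
    return a * a + (1 - a) * b, a * b + (1 - b) * b

eps = 1e-2
LOG2_E = math.log2(math.e)
c_lower = capacity(*square(eps, 0.0))        # square of the lower-branch channel
c_upper = capacity(*square(1 - eps, 1.0))    # square of the upper-branch channel

print("lower branch squared:", c_lower, " approx:", LOG2_E / math.e * eps ** 2)
print("upper branch squared:", c_upper, " approx:", LOG2_E / 8 * eps ** 3)
print("ratio:", c_lower / c_upper)
```

The ratio grows like $1/\epsilon$, which is the sense in which one
capacity is an order of magnitude smaller than the other.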

DeSoer* has emphasized the importance of proper 
intermediate station behavior in maximizing the end-to- 
end capacity of a cascade of channels. In particular, he 
compares the capacity of a cascade of continuous channels 
perturbed by white Gaussian noise with that of a cascade 
of PCM channels with the same signal-to-noise ratio. It 
is assumed that in the cascade of continuous channels the 
intermediate stations retransmit the received waveform 
without change, whereas in the cascade of PCM channels 
requantization occurs at each intermediate station. 
Although the continuous channel has a higher capacity 
than the PCM channel, the PCM channels deteriorate 
less in cascade. Thus, for some cascade length depending 
on the signal-to-noise ratio, the cascade of PCM channels 
has a larger end-to-end capacity than the cascade of 
continuous channels. 

We have an even simpler example of how proper intermediate
station behavior can preserve the end-to-end
capacity of cascaded channels. For, in the example just
given, if the output symbols of the $C_\epsilon'$ channel are reversed
at the intermediate station, we have

$$
c[C_\epsilon' J C_\epsilon'] = c[C_\epsilon C_\epsilon'] = c[C_\epsilon^{2}].
$$


On the other hand, the capacity of a cascade of symmetric 
channels is completely insensitive to whether the identity 
of the symbols is preserved or reversed at the intermediate 


26 IRE TRANSACTIONS—INFORMATION THEORY 


stations. Graphically, this is a consequence of the point of
contact of the two branches of the $(A, B)$ curve on the
symmetric channel line $\beta = 1 - \alpha$ (see Fig. 6).

Suppose we regard the behavior of the intermediate
station as a detection scheme. Then preserving the
designation of the output symbols of the preceding
channel at the intermediate station is minimum probability
of error detection⁸ if the channel lies in the square region
$\alpha > 1/2$, $\beta < 1/2$, and reversing the designation of the output
symbols is minimum probability of error detection if the
channel lies in the square region $\alpha < 1/2$, $\beta > 1/2$. If the
channel lies in the square region $\alpha, \beta < 1/2$, then minimum
probability of error detection requires that both zeros
and ones be changed to ones, an information-destroying
mapping. Similarly, if the channel lies in the square region
$\alpha, \beta > 1/2$, then minimum probability of error detection
requires that both zeros and ones be changed to zeros.
Thus it is apparent that maximum rate in cascade and 
minimum probability of error detection at intermediate 
stations are not always compatible. (DeSoer’ gives a 
complicated example that illustrates this fact.) If in- 
formation-destroying mappings are precluded, as they 
must be if maximum rate is the objective, we conclude 
that the larger capacity in cascade is given by minimum 
probability of error detection, which requires that the 
identity of the symbols be preserved at the intermediate 
station if the channel lies below the line $\beta = \alpha$, but
reversed if the channel lies above the line $\beta = \alpha$.

The proofs of the statements of the preceding paragraph 
are left to the reader. It is merely necessary to examine 
the expressions for the probabilities of error of each of the 
four delayless detection schemes and ascertain which is 
smallest in each square region. 


APPENDIX 
Redundancy Coding in the $\epsilon$-Channel


In our discussion of cascaded channels in the last 
section we considered only delayless operation of the inter- 
mediate station. If sufficient intermediate station delay 
is allowed, it follows from Shannon’s second coding 
theorem that the end-to-end capacity of a cascade of 
identical channels can be made arbitrarily close to the 
common capacity of the separate channels. Studies of 
probability of error and rate as a function of delay are 
still in progress,⁹ and it is perhaps too early to apply the
theory to cascaded channels. However, the $\epsilon$-channel is
susceptible to a simple type of redundancy coding, which
is effective just because of its low capacity. This redundancy
coding, although not ideal in the sense of
achieving capacity with a vanishingly small probability 
of error, nonetheless achieves a rate which is an appreciable 
fraction of capacity with a small probability of error. 


8 This is sometimes called maximum a posteriori probability
detection or the ideal observer.

9 Reference is made to recent work by C. E. Shannon and by
P. Elias, presented at the March, 1955 National Convention of the
IRE.
Moreover, it serves to illustrate how delay can be exchanged
for enhanced rate in a cascade of channels.¹⁰

In the redundancy coding to which we refer, each transmitted
digit is repeated $r$ times, and the receiver decides
whether a zero or a one was sent by examining sequences
of $r$ digits. More specifically, let the channel have matrix

$$
C = \begin{bmatrix} \epsilon & 1-\epsilon \\ 0 & 1 \end{bmatrix}
$$

with capacity $\frac{\log e}{e}\,\epsilon$ bits. Have the receiver examine
the output in blocks of $r$ symbols (properly synchronized
with the transmitter) and decode a block of $r$ ones as a
one and a block of $r$ digits with a zero at any position as a
zero. In other words, the symbols 0 and 1 are mapped
into the sequences 00 ··· 0 ($r$ times) and 11 ··· 1 ($r$
times) at the transmitter, and the events $S$ and $F$ are
mapped into 0 and 1 at the receiver, where $S$ designates
the appearance of a zero in a block of $r$ digits, and $F$ the
nonappearance of a zero in a block of $r$ digits.¹¹ Transmission
can then be regarded as taking place in an
equivalent channel $C(r)$ with matrix

$$
C(r) = \begin{bmatrix}
1 - (1-\epsilon)^{r} & (1-\epsilon)^{r} \\
0 & 1
\end{bmatrix}.
$$

The capacity of $C(r)$ is

$$
c[C(r)] = \log\left[1 + \exp_2\left(-\frac{H[(1-\epsilon)^{r}]}{1 - (1-\epsilon)^{r}}\right)\right]
$$

and its probability of error is

$$
P_e(r) = (1-\epsilon)^{r}\,P_0(r),
$$

where by $P_0(r)$ we mean the probability that the sequence
00 ··· 0 ($r$ times) should be transmitted if the capacity
$c[C(r)]$ is to be achieved. $P_0(r)$ is not the same as $P_0$ for
the $\epsilon$-channel, as given by (13) of the first section.

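Suppose that each of the transmitted symbols is repeated
$r = n/\epsilon$ times. Then, since $(1 - \epsilon)^{n/\epsilon} \approx e^{-n}$ for small $\epsilon$,
the equivalent channel becomes

$$
C(n) = \begin{bmatrix} 1 - e^{-n} & e^{-n} \\ 0 & 1 \end{bmatrix}
$$

with capacity

$$
c[C(n)] = \log\left[1 + \exp_2\left(-\frac{H(e^{-n})}{1 - e^{-n}}\right)\right], \tag{27}
$$

input symbol distribution

$$
P_0(n) = (1 - e^{-n})^{-1}\left[1 + \exp_2\left(\frac{H(e^{-n})}{1 - e^{-n}}\right)\right]^{-1}, \tag{28}
$$

and probability of error

$$
P_e(n) = e^{-n}\,P_0(n). \tag{29}
$$
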


10 Tentative studies of the effect of delay in cascaded channels 
have also been made by DeSoer?. 

11 Of course, redundancy coding is also effective in a low capacity
symmetric channel, but the analysis is more complicated, since now 
the events S and F refer to receiving more zeros than ones in a block 
of r received digits, and vice versa. 
Of course, before comparing the capacity of the redundant
channel with the capacity of the $\epsilon$-channel, we must
normalize (27) by dividing it by $r = n/\epsilon$, since only one
information digit is transmitted every $r$ units of time.
Moreover, we must now accept a delay of $r$ units. The
interesting result (displayed in Table I at the end of
this appendix) is that even when properly normalized,
the redundancy code (which is, after all, a very simple
code) gives rates which are an appreciable fraction of the
capacity of the $\epsilon$-channel, with a probability of error which
becomes smaller as we tolerate more delay. (Unfortunately,
the capacity goes to zero with the probability of error,
which is not the case for ideal coding.) Note that as more
redundancy is introduced, $C(n)$ becomes a better approximation
to the unit matrix, and $P_0(n)$ approaches 1/2.


TABLE I

 n    c[C(n)]   c[C(n)](ε/n)   P_0(n)   P_e(n)
 1    0.436     0.436ε         0.413    0.152
 2    0.707     0.353ε         0.448    0.061
 3    0.858     0.286ε         0.472    0.024
 4    0.934     0.234ε         0.487    0.0089
 5    0.971     0.194ε         0.493    0.0033
 6    0.987     0.165ε         0.497    0.0012
 7    0.994     0.142ε         0.499    0.00045
 8    0.998     0.125ε         0.500    0.00017
 9    0.999     0.111ε         0.500    0.00006
10    1.000     0.100ε         0.500    0.00002

(For no redundancy: c[C] = 0.531ε, P_0 = 0.368, P_e = 0.368)

Illustrating the redundancy-coded ε-channel. A table of the
quantities given by (27), (28), and (29), corresponding to a
redundancy r = n/ε.
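
The entries of Table I can be regenerated from (27), (28), and (29).
The sketch below is one way to do so, assuming that $H$ denotes the
binary entropy function in bits and that $\exp_2 x = 2^x$; the
function name `redundancy_row` is illustrative. The normalized rate
is printed as a multiple of $\epsilon$.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def redundancy_row(n):
    """Quantities (27), (28), (29) for the equivalent channel C(n)."""
    a = 1 - math.exp(-n)                  # probability that a block of zeros shows a zero
    s = h2(a) / a                         # h2(a) equals h2(exp(-n)) by symmetry
    capacity = math.log2(1 + 2 ** (-s))   # eq. (27), bits per block
    p0 = 1 / (a * (1 + 2 ** s))           # eq. (28)
    pe = math.exp(-n) * p0                # eq. (29)
    return capacity, capacity / n, p0, pe

print(" n  c[C(n)]  rate/eps  P0(n)   Pe(n)")
for n in range(1, 11):
    c, rate, p0, pe = redundancy_row(n)
    print(f"{n:2d}  {c:.3f}    {rate:.3f}     {p0:.3f}   {pe:.5f}")
```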


If $C(n)$ is cascaded $N$ times, we raise the matrix $C(n)$
to the $N$th power:

$$
C^{N}(n) = \begin{bmatrix}
(1 - e^{-n})^{N} & 1 - (1 - e^{-n})^{N} \\
0 & 1
\end{bmatrix}.
$$

The capacity of the product matrix is still normalized by
dividing it by the per-stage delay $r = n/\epsilon$, but the over-all
delay that must now be tolerated is $Nr$. Assume that $n$ is
large enough so that $e^{-n}$ is small and $c[C(n)] \approx 1.0$, and
suppose that we agree to let $c[C^{N}(n)]$ fall off only to the
value $k_1 < 1$. This amount of deterioration occurs when

$$
(1 - e^{-n})^{N} = f(k_1), \tag{30}
$$

where $f(k_1)$ is the abscissa of the point of intersection of
the equicapacity line $k_1$ with the $\alpha$-axis (see Fig. 2). If
we assume that $N$, the number of cascaded stages, is large,
(30) becomes

$$
\exp(-N e^{-n}) = f(k_1)
$$

or

$$
N e^{-n} = -\log f(k_1) = k_2 > 0.
$$

Thus

$$
N = k_2\,e^{n}, \tag{31}
$$

i.e., we can tolerate more channels in the cascade if we
increase $n$, and consequently the error-proofing and delay
per stage. Moreover, since

$$
P_e(n) \approx \tfrac{1}{2}\,e^{-n},
$$

(31) can be rewritten as

$$
N \approx \frac{k_2}{2\,P_e(n)}. \tag{32}
$$

Eq. (32) says that the amount of cascading permissible
to within a given tolerated deterioration of the end-to-end 
capacity is inversely proportional to the probability of 
error per stage.” 
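
The relation between (30) and (31) can be illustrated numerically.
In the sketch below, `f` locates by bisection the abscissa at which
the one-sided channel has capacity $k_1$ (standing in for the value
that the paper reads off Fig. 2), `stages_exact` solves (30) for a
real-valued $N$, and `stages_approx` evaluates (31) with natural
logarithms. The closed-form capacity of the one-sided channel and all
names are assumptions of this sketch.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def z_capacity(a):
    """Capacity (bits) of the channel [[a, 1-a], [0, 1]]."""
    return math.log2(1 + 2 ** (-h2(a) / a))

def f(k1):
    """Abscissa a with z_capacity(a) = k1, by bisection (0 < k1 < 1)."""
    lo, hi = 1e-9, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if z_capacity(mid) < k1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def stages_exact(n, k1):
    """Real-valued N solving (1 - e^-n)^N = f(k1), eq. (30)."""
    return math.log(f(k1)) / math.log1p(-math.exp(-n))

def stages_approx(n, k1):
    """Eq. (31): N = k2 * e^n with k2 = -ln f(k1)."""
    return -math.log(f(k1)) * math.exp(n)

k1 = 0.9
for n in (6, 8, 10):
    print(n, round(stages_exact(n, k1), 2), round(stages_approx(n, k1), 2))
```

Each unit increase in $n$ multiplies the tolerable number of stages by
about $e$, at the price of a longer delay per stage.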

If a noiseless feedback channel is available at the 
receiver, the rate can be increased by the simple expedient 
of having the receiver instruct the transmitter to begin a 
new run of r repetitions whenever a zero is received. For, 
since received zeros can only originate from transmitted 
zeros, it is a waste of channel space to continue repeating 
zeros for the rest of the run of r digits when a zero has 
already been received. In the limit of high redundancy 
this feedback procedure increases the rate by a factor 
approaching 2, because without feedback almost half the 
channel space is taken up by the needless repetition of 
zeros.
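
The factor approaching 2 can be made concrete with a small
calculation. The sketch below assumes an instantaneous, noiseless
feedback link, decisions identical to those of the scheme without
feedback, and a probability 1/2 of sending a zero (the
high-redundancy limit of Table I). Under these assumptions a run of
zeros stops at the first received zero, whereas a run of ones always
lasts $r$ digits. The function names are illustrative.

```python
def mean_digits_with_feedback(eps, r, p0=0.5):
    """Average number of channel digits per information digit with early stopping."""
    # A transmitted zero is stopped at the first received zero (or after r digits):
    # E[min(T, r)] = sum_{j=0}^{r-1} (1 - eps)^j for a geometric waiting time T.
    zero_run = (1 - (1 - eps) ** r) / eps
    # A transmitted one is never received as a zero, so its run lasts r digits.
    return p0 * zero_run + (1 - p0) * r

def rate_gain(eps, r, p0=0.5):
    """Ratio of the rate with feedback to the rate without feedback."""
    return r / mean_digits_with_feedback(eps, r, p0)

eps = 0.01
for n in (5, 10, 50, 1000):
    r = int(n / eps)
    print(f"n={n:5d}  r={r:7d}  gain={rate_gain(eps, r):.4f}")
```

The gain rises toward 2 as the redundancy grows, because the average
length of a run of zeros becomes negligible compared with $r$.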
Minimum Energy Cost of An Observation


F. P. ADLER 


Summary—The minimum energy expenditure required in per- 
forming basic observations and measurements is analyzed. The 
energy cost, in ergs per binary unit (bit) of information, is found for 
three fundamental cases using idealized experimental procedures: 
1) the determination of the presence (or absence) of an input signal 
on an indicating instrument, 2) the measurement of a time interval 
and 3) the measurement of a distance. The variation of energy cost 
with the reliability and accuracy of the experiment is determined; 
it is found that with a suitable procedure the minimum value of 
kT ln 2 ergs per bit predicted by the Second Law (interpreted so as to
include informational entropy) can be approached arbitrarily closely 
under conditions of small reliability and high accuracy. The present 
results are compared with those derivable from C. E. Shannon’s 
equation for the capacity of a communication channel. 


INTRODUCTION 


THE FACT that an observation requires expenditure
of energy and hence an increase in entropy has
been discussed by several authors¹⁻¹³ who have

also pointed out the resulting fallacy in the working of 
Maxwell’s demon. In the present paper we are concerned 
with finding quantitatively the minimum energy cost of 
an observation or experiment, in ergs per bit of informa- 
tion gained. Our point of departure is Brillouin’s article’ 
in which he points out that in any experiment the sum 
of entropy minus information¹⁴ can never decrease. The
efficiency of an observation, which he defines as the ratio
of information gain ($\Delta I_{th}$) to entropy increase ($\Delta S$), can
therefore never exceed unity. Brillouin then illustrates this
point using certain idealized basic measurement procedures.
The following analysis is largely an extension and
amplification, in quantitative terms, of the above paper. 
Although cost ($c$), rather than efficiency, will be considered
here, the latter may readily be found from the relation

$$
\eta = \frac{\Delta I_{th}}{\Delta S} = \frac{k\,\Delta I \ln 2}{E/T} = \frac{kT \ln 2}{c} \tag{1}
$$

where $\Delta I_{th}$ is in thermodynamic units and $\Delta I$ is in bits;
$E$ denotes the added energy and $T$ the absolute temperature.
An immediate consequence of this conversion
formula is seen to be that

$$
c_{\min} = kT \ln 2 \ \text{ergs per bit} \tag{2}
$$

represents the lower bound on information cost, corresponding
to the ideal case of unity efficiency.


OBSERVATION OF SINGLE OSCILLATOR 


We shall first consider the case of taking a reading on
an instrument such as a ballistic galvanometer to determine
whether or not a certain signal is present. The
indicating mechanism (needle and spring) constitutes
an oscillatory system (natural frequency $\nu$) subject to
Brownian motion. Following Brillouin, we may therefore
consider the observation of the needle position equivalent
to determining the energy state of a harmonic resonator
of frequency $\nu$ with quantized energy levels $nh\nu$. Now
suppose that a reading corresponding to a particular
energy level $mh\nu$ is observed. From the Boltzmann
distribution law the probability that the resonator would
attain this level due to thermal fluctuations only is

$$
p_0(m) = K e^{-mh\nu/kT} \tag{3}
$$

where $K = 1 - \exp(-h\nu/kT)$ is a normalizing constant
such that $\sum_{n=0}^{\infty} p_0(n) = 1$. If, on the other hand, an
input signal has actually been applied to the resonator,
then the probability of this level becomes

$$
p_1(m) = K e^{-(m-s)h\nu/kT} \qquad (m \ge s) \tag{4}
$$

where $s$ is the number of added signal quanta. Now the
information gain, $\Delta I$, will be the difference between $I_0$
and $I_1$, the information functions before and after the
observation. Assuming the maximum a priori uncertainty,
that is, signal and no signal equally likely, we have

$$
I_0 = \log_2 2 = 1 \ \text{bit} \tag{5}
$$

while

$$
I_1 = p_S \log_2 \frac{1}{p_S} + p_N \log_2 \frac{1}{p_N} \tag{6}
$$

where $p_S$ is the a posteriori probability that the observed
energy level $mh\nu$ was indeed due to a signal and $p_N$ that
it was a spurious indication caused by thermal noise.
Bayes' theorem is now used to find $p_S$ and $p_N$:

$$
\begin{aligned}
p_N = p(H_0/E) &= \frac{p(H_0)\,p(E/H_0)}{p(H_0)\,p(E/H_0) + p(H_1)\,p(E/H_1)}, \\
p_S = p(H_1/E) &= \frac{p(H_1)\,p(E/H_1)}{p(H_0)\,p(E/H_0) + p(H_1)\,p(E/H_1)},
\end{aligned}
\tag{7}
$$

where

E .......... observed event ($m$ quanta);
H_0 ........ hypothesis that signal is not present;
H_1 ........ hypothesis that signal is present;
p(H_0), p(H_1) .... a priori probabilities of H_0 and H_1 (assumed
             equal to 1/2);
p(E/H_0) ... probability of occurrence of E if H_0 is true =
             p_0(m) = K exp(-mhν/kT);
p(E/H_1) ... probability of occurrence of E if H_1 is true =
             p_1(m) = K exp[-(m - s)hν/kT].                    (8)

Thus¹⁵

$$
p_N = \frac{1}{1 + e^{sh\nu/kT}}, \qquad
p_S = \frac{1}{1 + e^{-sh\nu/kT}}. \tag{9}
$$

The information gain is therefore

$$
\Delta I(sh\nu/kT) = \Delta I(S)
= 1 - \frac{\log_2(1 + e^{S})}{1 + e^{S}} - \frac{\log_2(1 + e^{-S})}{1 + e^{-S}} \tag{10}
$$

where $S = sh\nu/kT$. Since it requires $s$ signal quanta or
$SkT$ ergs to produce $\Delta I$, the cost of the observation is
seen to be

$$
c = \frac{SkT}{\Delta I(S)} \ \text{ergs per bit}. \tag{11}
$$



AI 
A plot of c is shown in Fig. 1. The minimum cost is 
about 4.1 kT or 4.0 X 10°“ erg per bit at room temper- 
ature, corresponding to a maximum efficiency of 0.17. 
It occurs for a signal input of 2.57 $kT$. It is interesting to
compare this value with the value of 4 $kT$ considered by
G. Ising¹⁶ to be the minimum input required for a reliable
signal indication; it is seen that high reliability must be 
paid for by increased cost and lowered efficiency. 
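
The location of the minimum can be reproduced with a few lines of
code. The sketch below evaluates the information gain (10) from the
a posteriori probabilities (9), forms the cost (11) in units of $kT$,
and scans a grid of normalized signal energies $S$. The grid limits
and step, and the function names, are arbitrary choices made for this
illustration.

```python
import math

def info_gain(S):
    """Eq. (10): information gain in bits for normalized signal energy S."""
    p_n = 1 / (1 + math.exp(S))    # a posteriori probability of a noise-only reading
    p_s = 1 - p_n                  # a posteriori probability that a signal was present
    return 1 + p_n * math.log2(p_n) + p_s * math.log2(p_s)

def cost_in_kT(S):
    """Eq. (11): energy cost per bit, expressed in units of kT."""
    return S / info_gain(S)

grid = [0.5 + 0.001 * i for i in range(5501)]    # S from 0.5 to 6.0
S_min = min(grid, key=cost_in_kT)
c_min = cost_in_kT(S_min)
efficiency = math.log(2) / c_min                 # eq. (1): kT ln 2 / c

print(f"S at minimum = {S_min:.3f}")
print(f"minimum cost = {c_min:.3f} kT per bit")
print(f"efficiency   = {efficiency:.3f}")
```

The scan places the minimum near $S = 2.57$, with a cost close to
$4.1\,kT$ per bit and an efficiency of about 0.17, in agreement with
the values quoted in the text.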


TIME MEASUREMENT


Here we shall take as our experimental model an ideal- 
ized electronic timing device of the type suggested by 
Brillouin’®: a number (N) of receivers are connected to a 


line from which a signal pulse is expected to come; the
receivers are successively turned on, each for a time $\tau$,
and the occurrence in time of the pulse (relative to a
zero-time reference pulse) is determined by noting which
of the $N$ receivers shows an output signal. The switching
on of the receivers may be performed by means of timing
pulses which change bias voltages on diodes or vacuum
tube grids; this operation need not, in principle, require
any energy expenditure. The energy of the signal pulse,
however, is dissipated upon arrival at a receiver where
it is absorbed and presumably recorded.

15 It is of interest to note that because of the exponential nature of
the Maxwell-Boltzmann distribution, $p_N$ and $p_S$ turn out to be
actually independent of $m$. If this were not the case, the cost would need
to be averaged with respect to $m$ to find the mean cost over all
possible observed states.

16 G. Ising, "A natural limit for the sensibility of galvanometers,"
Phil. Mag., vol. 1-7, p. 827; April, 1926.


Fig. 1. Energy cost of observation on single resonator (energy cost
per bit, in units of kT, plotted against threshold energy in units
of kT). Threshold energy is a measure of reliability of observation.


We shall consider an observational procedure in which
a detector is considered as having received the signal
pulse if its energy state equals or exceeds a certain
threshold value, $E_t = n_t h\nu$.¹⁷ If only one detector shows
excitation above this level, the information gain is clearly

$$
\Delta I = \log_2 N \tag{12}
$$

assuming that initially all $N$ receivers were equally likely
to receive the signal pulse. The same result may of course
also be obtained using the a priori distribution of the
pulse occurrence time $t$, $p_0(t) = 1/N\tau$, and its a posteriori
distribution, $p_1(t) = 1/\tau$, giving

$$
\Delta I = \int_0^{N\tau} p_0 \log_2 \frac{1}{p_0}\,dt
 - \int_0^{\tau} p_1 \log_2 \frac{1}{p_1}\,dt = \log_2 N. \tag{13}
$$



If $m$ receivers, rather than just one, are observed to
exceed $n_t$ quanta, a unique measurement is not possible,
but at least the possible values of the time interval have
been narrowed down from $N$ to $m$. Although this may
perhaps be of somewhat questionable value to the
experimenter, the information gain is clearly

$$
\Delta I_m = \log_2 (N/m). \tag{14}
$$

Now the probability that a receiver will equal or exceed
the threshold level due to noise alone is

$$
p(n \ge n_t) = \sum_{n=n_t}^{\infty} p_0(n)
 = \sum_{n=n_t}^{\infty} K e^{-nh\nu/kT} = e^{-A} \tag{15}
$$

where $A$ denotes the normalized energy threshold $n_t h\nu/kT$.
The probability of a total of $m$ receivers giving an indication
is therefore

$$
p_m = \frac{(N-1)!}{(N-m)!\,(m-1)!}\, e^{-(m-1)A}\,(1 - e^{-A})^{N-m}
 = \binom{N-1}{m-1} e^{-(m-1)A}\,(1 - e^{-A})^{N-m},
 \qquad \sum_{m=1}^{N} p_m = 1, \tag{16}
$$

which is the product of the probability that $(m - 1)$
receivers will indicate a signal and the probability that
the remaining $(N - m)$ will not, multiplied by the number
of ways of realizing this situation with the $(N - 1)$
receivers which do not receive the signal pulse. The
average information gain is then given by

$$
\overline{\Delta I} = \sum_{m=1}^{N} p_m \log_2 (N/m). \tag{17}
$$

17 It may be thought that an increased amount of information
could be obtained if instead of setting a threshold level, the actual
amount of energy on each resonator were observed immediately
subsequent to its active interval and inferences drawn from these
values as to the likelihood of any detector having received a signal.
As shown in the Appendix, however, such an experimental procedure
turns out precisely equivalent to the one considered here.


The necessary signal energy is determined by the requirement
that the signal pulse should always produce
an indication, that is, equal or exceed the threshold
energy $E_t$; clearly, a signal energy equal to $E_t$ will produce
this effect, even if the receiving resonator should happen
to be in a zero quantum state. The minimum cost of the
time observation is therefore

$$
c_t = \frac{E_t}{\overline{\Delta I}}
 = AkT \left[\sum_{m=1}^{N} \binom{N-1}{m-1} e^{-(m-1)A}\,(1 - e^{-A})^{N-m} \log_2 (N/m)\right]^{-1}. \tag{18}
$$

Fig. 2 shows a plot of $c_t$ versus $N$ for three values of $E_t$:
$E_t = 4kT$ (Ising's value), $E_t = kT \ln 2$ (for which
$p(n \ge n_t) = p(n < n_t) = 1/2$) and $E_t \to 0$. It can be
shown that $c_t$ is minimized for $E_t \to 0$, giving

$$
\min c_t = \frac{kT}{(N-1)\log_2 \dfrac{N}{N-1}} \tag{19}
$$

corresponding to an efficiency

$$
\max \eta = (N - 1) \ln\left(\frac{N}{N-1}\right). \tag{20}
$$

For large values of $N$ the expression for $c_t$ approaches the
ideal value $kT \ln 2$, in agreement with (2), while $\eta$
approaches unity. A proof (not shown here) can be given
that these limits are also approached, though more
slowly, for all finite values of $E_t$; this behavior may be
seen indicated in Fig. 2.
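
The dependence of the time-measurement cost on $N$ and on the
threshold can be explored with the following sketch, which evaluates
(16), (17), and (18) directly and compares the result with the
limiting form (19). Costs are expressed in units of $kT$, and the
function names are illustrative.

```python
import math

def indication_probs(N, A):
    """Eq. (16): p_m for m = 1..N, with q = exp(-A) the noise-only firing probability."""
    q = math.exp(-A)
    return [math.comb(N - 1, m - 1) * q ** (m - 1) * (1 - q) ** (N - m)
            for m in range(1, N + 1)]

def time_cost(N, A):
    """Eq. (18): cost per bit in units of kT, for signal energy E_t = A kT."""
    probs = indication_probs(N, A)
    avg_info = sum(p * math.log2(N / m) for m, p in enumerate(probs, start=1))  # eq. (17)
    return A / avg_info

def min_time_cost(N):
    """Eq. (19): limiting cost per bit in units of kT as the threshold tends to zero."""
    return 1 / ((N - 1) * math.log2(N / (N - 1)))

for N in (2, 10, 100, 1000):
    print(N, round(min_time_cost(N), 4))
print("kT ln 2 limit:", round(math.log(2), 4))
print("N = 10:", time_cost(10, 4.0), time_cost(10, math.log(2)), time_cost(10, 1e-4))
```

For fixed $N$ the cost falls as the threshold is lowered, and the
limiting cost itself approaches $kT \ln 2$ as $N$ grows.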


Fig. 2. Energy cost of basic time determination.
Number of divisions (N) is a measure of accuracy of measurement.
Signal or threshold energy (AkT) is a measure of reliability of
measurement. Dotted curve is based on Shannon's channel capacity
formula (for A = 4).


LENGTH MEASUREMENT


We shall consider here the problem of determining the
distance to some test object known to be in the field of
observation. We first analyze an experimental procedure
analogous to one described by Brillouin’®. The total length 
$L$ is considered to be divided into $N = L/\Delta L$ intervals,
where $\Delta L$ is the desired accuracy of measurement. In
order to discover now which one of the cells $\Delta L$ contains
the particle, a small beam of light is directed at each one
in turn. A cell is then presumed to be empty if a resonator
placed behind it is excited by the light. If the resonator
does not show an indication, however, it must be in the
shadow of the particle which has absorbed the light. We
again set a threshold level $E_t = n_t h\nu$ and consider that if
an energy state $nh\nu$ is observed on a resonator, the
resonator has received the light beam provided $n \ge n_t$
(cell empty), and has not if $n < n_t$ (cell contains the
particle). In order that a resonator behind an empty
cell will always show an $n \ge n_t$, the light beam energy
used in scanning one cell is taken equal to $E_t$.

For the experiment to be successful the resonator behind
the particle must not exceed $E_t$ due to thermal fluctuations
at the time of observation. Otherwise no information at
all is obtained since the cell containing the particle is
indistinguishable from the $N - 1$ remaining cells. The
scanning process may therefore have to be repeated once
or several times before the particle position is determined.
The probability of finding it on the first scan is, using (15),

$$p(n < n_t) = 1 - e^{-A}. \tag{21}$$

For finding it on the second scan only, it is

$$p(n \ge n_t)\,p(n < n_t) = e^{-A}\left(1 - e^{-A}\right). \tag{22}$$

Finally, the probability of discovering the proper cell at
the $k$th trial, after $k - 1$ unsuccessful scans, is

$$\left[p(n \ge n_t)\right]^{k-1} p(n < n_t) = e^{-(k-1)A}\left(1 - e^{-A}\right). \tag{23}$$


Now on each unsuccessful scan, $Nn_t$ quanta are expended,
while the average cost of the (final) successful test is

$$(1 + 2 + 3 + \cdots + N)\,n_t/N = \frac{N+1}{2}\,n_t \ \text{quanta} \tag{24}$$

since each cell position is equally likely. The average
total energy expenditure for a successful observation
($n < n_t$) is therefore

$$\bar{E} = \sum_{k=1}^{\infty} e^{-(k-1)A}\left(1 - e^{-A}\right)\left[(k-1)N + \frac{N+1}{2}\right]E_t = \left[\frac{N+1}{2} + \frac{N}{e^{A} - 1}\right]E_t. \tag{25}$$


Once $n < n_t$ is observed for a particular cell, that cell
is known to contain the particle. The a posteriori
uncertainty is therefore zero and hence the information
gain is

$$\Delta I = \log_2 N. \tag{26}$$

The average cost is therefore

$$c_l = \frac{\dfrac{N+1}{2} + \dfrac{N}{e^{A} - 1}}{\log_2 N}\,AkT \ \text{ergs per bit}. \tag{27}$$
It can be shown that $c_l$ is minimized for $A \to 0$, giving

$$\min c_l = \frac{N}{\log_2 N}\,kT. \tag{28}$$
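
As a check on the closed forms above, the following Python sketch sums the scan-by-scan series directly and compares it with the closed-form average energy, then evaluates the cost per bit in units of $kT$. It assumes the readings of (25), (27) and (28) given by the surrounding derivation (success probability $1 - e^{-A}$ per scan, $N$ threshold energies per failed scan, $(N+1)/2$ on the final scan); the function names are illustrative.

```python
import math


def length_cost_per_bit(N: int, A: float) -> float:
    """Closed-form cost per bit of the length measurement, in units of kT."""
    mean_energy = (N + 1) / 2 + N / math.expm1(A)  # in units of the threshold energy
    return mean_energy * A / math.log2(N)


def length_cost_by_series(N: int, A: float, terms: int = 2000) -> float:
    """Same quantity, summing over the number k of scans needed."""
    q = math.exp(-A)  # probability that a scan fails
    mean_energy = sum(
        q ** (k - 1) * (1 - q) * ((k - 1) * N + (N + 1) / 2)
        for k in range(1, terms + 1)
    )
    return mean_energy * A / math.log2(N)


def min_length_cost(N: int) -> float:
    """Limiting cost for a vanishing threshold, in units of kT."""
    return N / math.log2(N)


if __name__ == "__main__":
    for N in (2, 6, 10, 18):
        print(N, length_cost_per_bit(N, 4.0), min_length_cost(N))
```

The printed values grow roughly like $N/\log_2 N$, which is the behavior the text attributes to the order-$N$ number of signals sent out per scan.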


Plots of $c_l$, again for $E_t = 4kT$, $kT \ln 2$ and $E_t \to 0$,
are shown in Fig. 3. Comparison with Fig. 2 shows that
while $c_t$ and $c_l$ are roughly comparable in the region of
low accuracy (small $N$), the length measurement cost
becomes increasingly large as the accuracy is increased.
The reason for the relative inefficiency of the length
observation is not difficult to see: the number of signals
sent out and absorbed is of the order $N$, while in the case
of the time measurement only a single signal pulse is
involved. The high $c_l$ values found are not believed to be
basic to the problem of length measurement, however. A
more efficient procedure, for example, would be to reduce
the distance to a time measurement analogously to the
operation of a pulse radar set: a pulse is sent from $O$ to
the particle position $P$ and the time interval between
transmission of the pulse and receipt of the reflected
pulse is observed as described in the section on time
measurement; the distance $OP$ is then obtained from a
knowledge of the speed of light.


Fig. 3. Energy cost of a length determination (energy cost per bit, in units of $kT$, versus number of divisions $N$).


DISCUSSION


It is of interest to compare the results obtained in the
section on time measurement with Shannon's classical
channel formula. In slightly modified form, this relation
states that the maximum information gain obtainable in
a time $t$ with a signal energy $E$ and a channel bandwidth
$W$ is

$$\Delta I = Wt \log_2\left(1 + \frac{E}{n_0 W t}\right) \ \text{bits} \tag{29}$$

where $n_0$ is the noise power spectral density which has the
minimum value $kT$. The minimum energy cost is therefore

$$c_s = \frac{2E}{N \log_2\left(1 + \dfrac{2E}{NkT}\right)} \ \text{ergs per bit} \tag{30}$$


where $2Wt$, the number of independent samples which
can be transmitted in time $t$ with bandwidth $W$, has been
denoted by $N$, since this number is roughly analogous to
the $N$ (number of detectors or divisions) used in the two
preceding sections. For $N \to \infty$, $c_s$ is minimized to

$$\min c_s = kT \ln 2 \ \text{ergs per bit} \tag{31}$$


a result which has been pointed out before¹⁸. A plot of $c_s$
is shown in Fig. 2 for $E = 4kT$. Although the precise
significance of this curve for the small $N$-values shown
may be open to some discussion since (29) holds strictly
only for a long signal sequence, it does bring out the
higher efficiency obtainable with (ideal) coding than with
a straightforward measurement procedure.
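
The behavior of (30) can be illustrated with a short Python sketch. It assumes the form $c_s = 2E / [N \log_2(1 + 2E/NkT)]$ with $E$ expressed in units of $kT$, so that the returned cost is also in units of $kT$; the function name is illustrative.

```python
import math


def shannon_cost_per_bit(N: float, E: float) -> float:
    """Cost per bit from Shannon's formula, in units of kT (E also in units of kT)."""
    snr_per_sample = 2.0 * E / N
    bits = N * math.log1p(snr_per_sample) / math.log(2.0)  # N * log2(1 + 2E/N)
    return 2.0 * E / bits


if __name__ == "__main__":
    for N in (2, 6, 10, 18, 1000):
        print(N, shannon_cost_per_bit(N, 4.0))
    print("limit:", math.log(2.0))
```

For fixed $E = 4kT$ the cost falls steadily with $N$ and tends to $\ln 2$, in line with (31).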

In addition to the quantitative agreement between
(31) and the result obtained in the section on time
measurement, there is also a broader similarity between
Shannon's information cost (30) and that derived here
(18). In both cases, cost is minimized by reducing the
reliability of a received signal indication to a minimum,
by using a vanishingly small value of signal energy ($E \to 0$),
and by making the number of independent samples or
divisions very large ($2Wt \to \infty$, $N \to \infty$). As a practical
matter, these low-cost conditions are usable for communication
systems employing efficient wide-band coding and
modulation schemes; for the case of physical measurements,
however, they are not the ones commonly used in
the laboratory since awkward repetition procedures would
be required to obtain significant amounts of reliable
information. There thus appears to be an important
distinction between communication systems which allow
ingenious coding of the information, and measurement
procedures in which the quantity to be determined has to
be measured in pretty much its nature-given form.¹⁹


18 See, for example, J. A. Felker, "A link between information and
energy," Proc. IRE, vol. 40, p. 728; June, 1952. Felker, however,
associates $E$ with the battery or power supply energy required to
amplify a signal. This is in disagreement with the more fundamental
viewpoint adopted here, as well as in the paper by Brillouin, who
states in connection with the timing scheme of the section on time
measurement: "We may think of using a tuned receiver, but usually
an amplifier is needed. This is just an auxiliary device to increase the
power and we are not going to consider the energy required in the
amplifying system itself" (J. Appl. Phys., vol. 24, p. 1161; 1953).
Cf. also F. P. Adler, "Comments on 'Figure of merit for communication
devices'," Proc. IRE, vol. 42, p. 1191; July, 1954.

19 For an interesting discussion of the differences between
communication and measurement systems see P. H. Blundell, "The
definition of rate of information in the presence of noise," p. 27 of
"Communication Theory" (papers read at a Symposium on
Communication Theory held at London), Butterworth's Scientific
Publications (1953).


APPENDIX 
Time Measurement Without Threshold Level 


As was stated in a footnote to the text of the section
on time measurement, it may seem reasonable that more
information could be extracted if, instead of merely
observing whether or not the energy of a resonator
exceeded some threshold level, the actual energy value
were to be noted. Suppose, then, that an energy level
$n_i h\nu$ is observed on the $i$th detector. The probability of
observing this level due to noise alone (hypothesis $H_0$) is

$$p(n_i/H_0) = K e^{-n_i h\nu/kT}.$$

If the detector has received the signal (hypothesis $H_1$),
this probability becomes

$$p(n_i/H_1) = \begin{cases} K e^{-(n_i - s)h\nu/kT} & \text{for } n_i \ge s \\ 0 & \text{for } n_i < s \end{cases}$$

where $sh\nu$ is the energy of the signal pulse. Using Bayes' theorem
and assuming a priori probabilities $p(H_0) = p(H_1) = 1/2$,
the following a posteriori probabilities are obtained:

$$p(H_0/n_i) = \begin{cases} 1 & \text{if } n_i < s \\ \dfrac{1}{1 + e^{sh\nu/kT}} & \text{if } n_i \ge s \end{cases}$$

$$p(H_1/n_i) = \begin{cases} 0 & \text{if } n_i < s \\ \dfrac{1}{1 + e^{-sh\nu/kT}} & \text{if } n_i \ge s \end{cases}$$


The observation of all $N$ $n_i$'s will therefore divide the
resonators into two groups: those which are known not to
have received the signal ($n_i < s$) and those which may
have received it ($n_i \ge s$). Since $p(H_1/n_i)$ for $n_i \ge s$ does
not depend on $n_i$, the detectors in the second group are
indistinguishable from each other; an observed energy
level $(s + 10^6)h\nu$ is no more indicative of a received
signal than a level of $sh\nu$. The signal energy value acts
effectively as a threshold or dividing level and the rest of the
analysis proceeds as in the section on time measurement.
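
The independence of the a posteriori probability from the observed level can be verified directly. The Python sketch below applies Bayes' theorem with equal priors to the two likelihoods of this appendix; the normalizing constant $K$ cancels, and `x` stands for $h\nu/kT$. The function name and the numerical values are illustrative.

```python
import math


def posterior_signal(n_i: int, s: int, x: float) -> float:
    """P(H1 | n_i) for equal priors, where x = h*nu/(k*T)."""
    if n_i < s:
        return 0.0  # the signal alone would already have raised the level to s
    like_noise = math.exp(-n_i * x)          # proportional to p(n_i / H0)
    like_signal = math.exp(-(n_i - s) * x)   # proportional to p(n_i / H1)
    return like_signal / (like_noise + like_signal)


if __name__ == "__main__":
    for level in (3, 5, 6, 50):
        print(level, posterior_signal(level, s=5, x=0.2))
```

Every level at or above $s$ yields the same posterior, $1/(1 + e^{-sh\nu/kT})$, which is why the signal energy acts as the dividing level.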
Some Remarks on Statistical Detection*


W. L. ROOT AND T. S. PITCHER


Summary—A particular type of communications detection prob- 
lem is considered: the problem of specifying a detector to decide 
which one of two sure signals is being transmitted when the signals 
are perturbed randomly both by Gaussian noise and multipath 
transmission. If the delays in the various channels are known, but 
the strengths are random, a maximum likelihood detector may be 
specified by methods which are a simple extension of known meth- 
ods. If the delays are random, the problem is more difficult. One 
possible solution is first to estimate certain channel parameters from 
the received signal and then to use these estimates in a likelihood 
test. It is shown how to make consistent unbiased estimates for
appropriate channel parameters under certain assumptions on the
nature of the signal.


INTRODUCTION 


WITHIN THE LAST few years the general problem of
detecting signals in the presence of noise has been
recognized in its idealized form to be statistical,
and a good deal of theoretical work has been done on it
(see e.g. [4] and [5]). One common type of detection
problem is to establish a criterion for determining whether
or not a known signal is present in noise; another is to
establish a criterion for determining which one of a
collection of known possible signals is present in noise.
Usually the noise is represented mathematically as a
stationary Gaussian stochastic process independent of
the signal which is simply added to the signal by the
channel. In this paper we look at problems of the latter
type but we consider cases where the signal is randomly
perturbed in other ways than just by the addition of
Gaussian noise. The particular communications problem
which motivated this work is that of receiving signals
(which are sent in a two-letter alphabet) over a radio
channel in which the signals are distorted both by additive
noise and by multipath propagation with in general random
(or at least unknown) characteristics. Before formulating
any examples, we discuss briefly and very generally the
point of view which we adopt in treating them.

Let $x_0(t)$, $x_1(t)$ be real-valued functions of a real
variable $t$, $0 \le t \le T$. These functions represent two
possible known signals one of which is to be transmitted
during a time interval of duration $T$. Let $P$ represent a
channel randomly perturbing the transmitted signal. We
ask for a procedure which will, from a knowledge of

$$s(t) = P(x_i(t)), \quad i = 0, 1,$$

lead to a reasonable judgment as to whether $i$ is equal
to 0 or 1; i.e., we ask for a detector which will tell as well
as possible from the received signal which of the two
possible transmitted signals was actually sent.

In statistical language, $s(t)$ is one of two stochastic
processes, depending upon whether $i = 0$ or 1, and we
want a statistical test which will allow us from observing
a realization of $s(t)$ to decide which process it belongs to.


* The research in this document was supported jointly by the
Army, Navy, and Air Force under contract with the Massachusetts
Institute of Technology.

This problem is one of testing statistical hypotheses
and making a decision. We do not here want to enter into
the difficult matter of optimum statistical decisions, but
rather want to discuss tests for certain specific cases with
the understanding that the method of making the decision
is fixed. Our detection procedure can be described in
general as follows: Let $(z_1, z_2, \cdots)$ be a finite or countably
infinite set of measurable functions on the underlying
probability space; i.e., under hypothesis $i$, $i = 0$ or 1,
$(z_1, z_2, \cdots)$ is a vector-valued random variable. Let
probability densities $f_0(z_1, \cdots, z_n)$, $f_1(z_1, \cdots, z_n)$ be
defined for both hypotheses for all $n$ if there are infinitely
many $z$'s and for $n$ equal to the number of $z$'s if there are
finitely many. Form the likelihood ratio

$$l_n = \frac{f_0(z_1, \cdots, z_n)}{f_1(z_1, \cdots, z_n)}$$

and suppose it converges to $l$ if $n \to \infty$. Then a realization
$s(t)$ of $P(x_i(t))$ will assign values to the $z$'s and hence to
$l$; choose $i = 0$ or 1 according as $l$ is $>$ or $<$ 1. Following
Grenander's [1] terminology we call the $z$'s "observable
coordinates".

In the first section we consider detection from the point
of view just outlined for the cases in which the received
signal $s(t)$ is given by

1) $s(t) = x_i(t) + n(t)$, $i = 0, 1$,
2) $s(t) = x_i(t) + a\,x_i(t - \tau) + n(t)$, $i = 0, 1$,

where $n(t)$ is a stationary Gaussian process with mean
zero and $a$ is a random variable independent of $n(t)$. The
first case is known but since it applies to what follows we
have summarized the pertinent results. We have followed
Grenander [1] in doing this. We have also shown that a
particularly simple type of singular case of (1) does not
arise if $n(t)$ is filtered white noise. The second case can be
easily generalized to one with more delay terms.

In the second section we again consider case (2) and
then the much more general case,

$$s(t) = \int x_i(t - \tau)\,dF(\tau) + n(t), \quad i = 0, 1,$$

where $dF(\tau)$ is a Stieltjes measure. In this section we
regard the channel parameters ($a$ and $dF$) as having
unknown probabilistic behavior and the technique is to
find consistent linear estimates for them from $s(t)$ and
use these estimates to simplify the statistical hypotheses
test problem.


I. MAXIMUM LIKELIHOOD TEST FOR SIGNALS WITH KNOWN DELAY


In this section we first review briefly the problem of
choosing $i$ when $s(t)$ has the representation

$$s(t) = x_i(t) + n(t), \quad 0 \le t \le T, \quad i = 0, 1,$$

$n(t)$ being stationary Gaussian noise with mean zero and
autocorrelation $r(s, t) = \rho(s - t)$. Let $\{\varphi_k(t)\}$ be a set of
orthonormal eigenfunctions and $\{\lambda_k\}$ the corresponding
eigenvalues of

$$\lambda \int_0^T \rho(s - t)\,\varphi(t)\,dt = \varphi(s).$$

Then by a theorem of Karhunen and Loève [6, pp. 327, 328]

$$n(t) = \operatorname*{l.i.m.}_{N \to \infty} \sum_{k=1}^{N} z_k \varphi_k(t)$$

where

$$z_k = \int_0^T n(t)\,\overline{\varphi_k(t)}\,dt.$$

The $z_k$'s are random variables which satisfy

$$E\{z_k \bar{z}_m\} = \frac{1}{\lambda_k}\,\delta_{km}, \qquad E\{z_k\} = 0,$$

where the bar denotes complex conjugate, $E\{\cdot\}$ denotes
mathematical expectation or average and $\delta_{km}$ is equal to
1 if $k = m$ and zero otherwise. Now since $n(t)$ is a Gaussian
random process the $z_k$ have a joint Gaussian distribution
and hence are independent. We introduce the notation
$a_k = (\varphi_k, x_0)$ and $b_k = (\varphi_k, x_1)$, where
$(\varphi_k, x_0)$ is the "inner product" $\int_0^T \varphi_k(t)\,x_0(t)\,dt$,
and use as observable coordinates

$$y_k = (\varphi_k, s) = \begin{cases} a_k + z_k & \text{if } i = 0 \\ b_k + z_k & \text{if } i = 1 \end{cases}$$

and $\zeta_k = (\psi_k, s)$, where the $\varphi_k$'s and $\psi_k$'s form a complete
orthonormal set.

If there is a function $\psi(t)$ orthogonal to $\{\varphi_k(t)\}$, i.e.
$(\psi, \varphi_k) = 0$, $k = 1, 2, \cdots$, for which $(x_0, \psi) \ne (x_1, \psi)$, the
problem is said to be extreme singular. Then since $\psi(t)$ is
orthogonal to $n(t)$ with probability one, the statistic
$(s, \psi)$ is equal with probability one to $(x_0, \psi)$ if $i = 0$ and
to $(x_1, \psi)$ if $i = 1$, thus providing perfect detection. In
practice however $\{\varphi_k(t)\}$ usually spans the whole space so
the extreme singular case cannot happen. This is so in
particular if $n(t)$ is filtered white noise (see Appendix I).

We consider now the remaining case, in which the
difference of the mean-value functions, $x_1(t) - x_0(t)$, is
in the Hilbert space spanned by the eigenfunctions
$\{\varphi_k(t)\}$. If $\psi$ is orthogonal to $\{\varphi_k\}$ then $(s, \psi)$ is a constant
independent of $i$, so we need only consider the observable
coordinates $(s, \varphi_k)$, $k = 1, 2, \cdots$.


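Now

$$E\{y_k\} = \begin{cases} a_k & \text{if } i = 0 \\ b_k & \text{if } i = 1 \end{cases} \qquad \operatorname{var}\{y_k\} = \operatorname{var}\{z_k\} = \frac{1}{\lambda_k} \ \text{for } i = 0 \text{ or } i = 1,$$

where $\operatorname{var}\{y_k\}$ means the variance of the random variable
$y_k$. Hence

$$f_0(y_1, \cdots, y_n) = \prod_{k=1}^{n}\left(\frac{\lambda_k}{2\pi}\right)^{1/2} \exp\left[-\frac{1}{2}\sum_{k=1}^{n}\lambda_k (y_k - a_k)^2\right]$$

and $f_1(y_1, \cdots, y_n)$ is the same except $b_k$ replaces $a_k$. Thus

$$l_n(\omega) = \exp\left[\frac{1}{2}\sum_{k=1}^{n}\lambda_k\left(a_k^2 - b_k^2\right) - \sum_{k=1}^{n}\lambda_k y_k (a_k - b_k)\right].$$

Let

$$f_n(t) = \sum_{k=1}^{n}(a_k - b_k)\lambda_k\varphi_k(t),$$

then

$$\log l_n(\omega) = \frac{1}{2}\sum_{k=1}^{n}\lambda_k\left(a_k^2 - b_k^2\right) - \sum_{k=1}^{n}\lambda_k y_k (a_k - b_k)$$

$$= \frac{1}{2}\sum_{k=1}^{n}\lambda_k(a_k - b_k)\int_0^T \varphi_k(t)\left[x_0(t) + x_1(t)\right]dt - \sum_{k=1}^{n}\lambda_k(a_k - b_k)\int_0^T \varphi_k(t)\,s(t)\,dt$$

$$= -\int_0^T f_n(t)\left[s(t) - \frac{x_0(t) + x_1(t)}{2}\right]dt.$$

If

$$\sum_{k=1}^{\infty}\lambda_k(a_k - b_k)^2 < \infty$$

it can be shown [1, pp. 215-216] that $\log l_n(\omega)$ converges
to a finite value with probability one under either
hypothesis. If

$$\sum_{k=1}^{\infty}\lambda_k(a_k - b_k)^2 = \infty$$

then $\log l_n(\omega)$ converges to $+\infty$ in probability with
respect to hypothesis 1 and $-\infty$ in probability with
respect to hypothesis 0; i.e., the singular case obtains.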
Thus a maximum likelihood test consists in determining
whether

$$\lim_{n \to \infty}\int_0^T f_n(t)\left[s(t) - \frac{x_0(t) + x_1(t)}{2}\right]dt$$

is less than or greater than zero; if the limit is $+\infty$ or
$-\infty$ the test gives the correct hypothesis with probability
one. Now,

$$\int_0^T \rho(s - t)\,f_n(t)\,dt = \sum_{k=1}^{n}(a_k - b_k)\varphi_k(s)$$

by Mercer's theorem. Hence, since $x_0 - x_1$ is of integrable
square on $(0, T)$,

$$\operatorname*{l.i.m.}_{n \to \infty}\int_0^T \rho(s - t)\,f_n(t)\,dt = x_0(s) - x_1(s).$$

If now in addition $f_n$ converges in mean square to a
function $f$, then

$$\int_0^T \rho(s - t)\,f(t)\,dt = x_0(s) - x_1(s)$$

for a.e. $s$. In this case

$$\lim_{n \to \infty}\int_0^T f_n(t)\left[s(t) - \frac{x_0(t) + x_1(t)}{2}\right]dt = \int_0^T f(t)\left[s(t) - \frac{x_0(t) + x_1(t)}{2}\right]dt$$

with probability one; thus the test function $f$ must be a
solution of the integral equation above. Conversely if the
integral equation has a solution $f$ of integrable square, then
this $f$ will serve as a test function [1, pp. 218-219]. Let

$$Rf(s) = \int_0^T \rho(s - t)\,f(t)\,dt.$$

Now if the spectrum of $r$ is broad and flat, $f = x_0 - x_1$ is
nearly a solution. That is, under this circumstance the
maximum likelihood test is nearly a correlation test.
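
A discretized version of this test can be sketched in Python. The sketch assumes a uniform time grid, replaces the integral equation $Rf = x_0 - x_1$ by a linear system, and forms the statistic $(f, s - (x_0 + x_1)/2)$. With $f$ chosen this way a noise-free $s = x_0$ gives $\tfrac{1}{2}(Rf, f) \ge 0$, so a positive statistic is read as "$x_0$ sent". The exponential autocorrelation, the signals, and all function names are hypothetical choices for illustration.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def build_kernel(rho, n, dt):
    """Matrix form of the operator R on a uniform grid (dt is the quadrature weight)."""
    return [[rho((j - k) * dt) * dt for k in range(n)] for j in range(n)]


def solve_weighting(x0, x1, rho, dt):
    """Discretized solution f of R f = x0 - x1."""
    kernel = build_kernel(rho, len(x0), dt)
    return solve_linear(kernel, [a - b for a, b in zip(x0, x1)])


def decide(s, x0, x1, f, dt):
    """Return 0 if (f, s - (x0 + x1)/2) is positive, else 1."""
    stat = dt * sum(fk * (sk - 0.5 * (a + b)) for fk, sk, a, b in zip(f, s, x0, x1))
    return 0 if stat > 0 else 1


# Hypothetical example: exponential noise autocorrelation and antipodal signals.
dt, n = 0.05, 40
grid = [k * dt for k in range(n)]
rho = lambda u: math.exp(-abs(u) / 0.2)
x0 = [math.sin(2 * math.pi * t) for t in grid]
x1 = [-v for v in x0]
f = solve_weighting(x0, x1, rho, dt)
```

When the autocorrelation is much narrower than the signal variations the solved $f$ is close to a multiple of $x_0 - x_1$, which is the correlation-test limit mentioned in the text.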

If delay terms with fixed delay are present as well as
the additive noise, it is still possible to obtain the likelihood
ratio using the same observable coordinates as above. We
show this only for the simplest such case. Let

$$s(t) = x_i(t) + a\,x_i(t - \tau) + n(t)$$

where $\tau$ is known. If $i = 0$ the joint density of $y_1, \cdots, y_N$ is

$$\int f_a(\alpha)\prod_{k=1}^{N} f_k(y_k - a_k - \alpha c_k)\,d\alpha,$$

where

$$c_k = \int_0^T \varphi_k(t)\,x_0(t - \tau)\,dt,$$

where $f_a$ is the density function of $a$ and $f_k$ is the Gaussian
density function with mean 0 and variance $1/\lambda_k$. The
density for $i = 1$ is obtained by substituting $b_k$ for
$a_k$ and

$$d_k = \int_0^T \varphi_k(t)\,x_1(t - \tau)\,dt$$

for $c_k$. By a martingale theorem the ratio of the two
converges as $N \to \infty$ so in theory one can use them to
perform a maximum likelihood test [2, p. 348ff].


II. ESTIMATION OF CHANNEL PARAMETERS


In many situations it is unreasonable to assign prob- 
ability distributions to the channel parameters. Hence 
there is some interest in developing a procedure to estimate 
these parameters and then use these estimates to reduce 
the problem to a simpler one of hypotheses testing, that of 
choosing the mean of a Gaussian process. 

We first treat the case

$$s(t) = x_i(t) + a\,x_i(t - \tau) + n(t)$$

with $\tau$ fixed by finding a minimum variance unbiased
estimate for $a$ of the form $a^* = c + (s, f)$. For $a^*$ to be
unbiased we must have

$$E_a(a^*) = c + (x_i, f) + a\,(x_{i,\tau}, f) = a$$

for all $a$, where $E_a$ indicates the expectation if $a$ is true
and we have written $x_{i,\tau}$ for $x_i$ delayed by $\tau$. The above
is true if and only if

$$(x_0, f) = (x_1, f) = -c$$

and

$$(x_{0,\tau}, f) = (x_{1,\tau}, f) = 1.$$

Also

$$\operatorname{var}(a^* - a) = \operatorname{var}(f, n) = \int_0^T\!\!\int_0^T \rho(s - t)\,f(s)\,f(t)\,ds\,dt = (Rf, f),$$

where

$$Rf(s) = \int_0^T \rho(s - t)\,f(t)\,dt.$$

If for some $a$, $b$, and $c$, $f$ satisfies the constraints above and

$$Rf = a\,x_{0,\tau} + b\,x_{1,\tau} + c\,(x_0 - x_1)$$

then

$$a^* = (s, f) - (x_0, f)$$

is a minimum variance estimate. For if

$$a^{\dagger} = (s, h) - (x_0, h)$$

is another unbiased estimate, we have

$$\operatorname{var}(a^{\dagger} - a) = \operatorname{var}(h, n) = (Rh, h) = (Rf, f) + 2(Rf, h - f) + (R(h - f), h - f),$$

but the last term is nonnegative because of the nature of
the operator $R$ and the middle term is zero since

$$(Rf, h - f) = a\,(x_{0,\tau}, h - f) + b\,(x_{1,\tau}, h - f) + c\,(x_0 - x_1, h - f) = 0.$$

In particular if $f_1$, $f_2$, and $f_3$ are solutions of $Rf_1 = x_{0,\tau}$,
$Rf_2 = x_{1,\tau}$, $Rf_3 = x_0 - x_1$ then $af_1 + bf_2 + cf_3$ will give
a minimum variance estimate if $a$, $b$, and $c$ satisfy the
algebraic equations

$$(af_1 + bf_2 + cf_3,\ x_0 - x_1) = 0$$
$$(af_1 + bf_2 + cf_3,\ x_{0,\tau}) = 1$$
$$(af_1 + bf_2 + cf_3,\ x_{1,\tau}) = 1.$$
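
The construction above can be sketched numerically in Python. The sketch discretizes $R$ on a uniform grid, solves $Rf_1 = x_{0,\tau}$, $Rf_2 = x_{1,\tau}$, $Rf_3 = x_0 - x_1$, and then solves the three algebraic equations for the coefficients. The signals, the exponential autocorrelation, the delay, and all names are hypothetical; the noise-free check at the end only confirms unbiasedness, not minimum variance.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def inner(u, v, dt):
    """Discrete inner product approximating the integral of u(t) v(t) dt."""
    return dt * sum(a * b for a, b in zip(u, v))


def min_variance_weights(x0, x1, x0d, x1d, rho, dt):
    """Weighting function f = a f1 + b f2 + c f3 satisfying the three constraints."""
    n = len(x0)
    kernel = [[rho((j - k) * dt) * dt for k in range(n)] for j in range(n)]
    diff = [a - b for a, b in zip(x0, x1)]
    # f1, f2, f3 solve R f1 = x0d, R f2 = x1d, R f3 = x0 - x1.
    basis = [solve_linear(kernel, h) for h in (x0d, x1d, diff)]
    # Rows: (f, x0 - x1) = 0, (f, x0d) = 1, (f, x1d) = 1.
    gram = [[inner(fp, g, dt) for fp in basis] for g in (diff, x0d, x1d)]
    a, b, c = solve_linear(gram, [0.0, 1.0, 1.0])
    return [a * u + b * v + c * w for u, v, w in zip(*basis)]


def estimate_a(s, x0, f, dt):
    """The estimate a* = (s, f) - (x0, f)."""
    return inner(s, f, dt) - inner(x0, f, dt)


# Hypothetical example with a known path strength and no noise.
dt, n, tau, a_true = 0.025, 40, 0.1, 0.3
grid = [k * dt for k in range(n)]
rho = lambda u: math.exp(-abs(u) / 0.2)
sig0 = lambda t: math.sin(2 * math.pi * t)
sig1 = lambda t: math.sin(4 * math.pi * t)
x0, x1 = [sig0(t) for t in grid], [sig1(t) for t in grid]
x0d, x1d = [sig0(t - tau) for t in grid], [sig1(t - tau) for t in grid]
f = min_variance_weights(x0, x1, x0d, x1d, rho, dt)
s_if_0 = [u + a_true * v for u, v in zip(x0, x0d)]
s_if_1 = [u + a_true * v for u, v in zip(x1, x1d)]
```

Because the constraints force $(x_0 - x_1, f) = 0$ and $(x_{i,\tau}, f) = 1$, the same weighting function recovers $a$ whichever signal was sent.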


The above model is a special case of a more general one
which we consider in the following paragraphs. There we
show a method which is easier to apply than the above
for obtaining unbiased consistent estimates of channel
parameters. These estimates are not in general of minimum
variance, however.

Let

$$s(t) = \int x_i(t - \tau)\,dF(\tau) + n(t),$$

$dF$ being any Stieltjes measure confined to a reasonable
interval. We assume that

$$\rho_{00}(\tau) = \rho_{11}(\tau) = \sum_{j=0}^{N} a_j\tau^j$$

$$\rho_{01}(\tau) = \rho_{10}(\tau) = \sum_{j=0}^{N} b_j\tau^j$$

and write $c_j = a_j + b_j$. We assume below that

$$\int_0^T x_i(t)\,x_j(t - (\tau - \sigma))\,dt = \rho_{ij}(\tau - \sigma).$$

There is a class of functions $\{x(t)\}$ for which this is true,
of course, and it is true asymptotically for any function
which might represent a conventional signal. Then,

$$(s, x_{0,\tau} + x_{1,\tau}) = \sum_{j=0}^{N} c_j\int(\tau - \sigma)^j\,dF(\sigma) + (n, x_{0,\tau} + x_{1,\tau}).$$

Forming the inner product above for $N + 1$ different
values of $\tau$ and expanding the terms $(\tau - \sigma)^j$ gives a set
of equations of the form

$$S_j = (s, x_{0,\tau_j} + x_{1,\tau_j}) = \sum_{k=0}^{N}\alpha_{jk}\int\sigma^k\,dF(\sigma) + (n, x_{0,\tau_j} + x_{1,\tau_j}),$$

where each $\alpha_{jk}$ is a linear combination of the $c$'s. Setting
the noise term to zero and solving the resultant set of
equations gives unbiased estimates $\mu_0, \cdots, \mu_N$ for the
moments of $dF$ up to order $N$ and leads to the approximations

$$(s, x_0) = \begin{cases}\displaystyle\sum_{k=0}^{N} a_k\mu_k + (n, x_0) & \text{if } i = 0 \\ \displaystyle\sum_{k=0}^{N} b_k\mu_k + (n, x_0) & \text{if } i = 1.\end{cases}$$

Thus we are left with the problem of testing for the mean
of a Gaussian random variable. In Appendix II we prove
the consistency of these estimates as $T \to \infty$. If $T$ remains
fixed and successive estimates are averaged, the averaged
estimates are also consistent.

We illustrate the above technique in a simple case.
Suppose

$$\rho_{00}(\tau) = a_0 + a_1\tau, \qquad \rho_{01}(\tau) = b_0 + b_1\tau,$$

then


= (S$, Zo + 4) = Cee re Lise) 


oc, 


Mo 


(oC, —"Cy)(S ome es CaS, Voneicts Dine) d 
i= 3 
a Cy 


These estimated moments of $dF$ may then be used, for
example, to give estimated values of the mean of the
normally-distributed statistic $(s, x_0)$. In particular:

Estimated mean of $(s, x_0)$ $= a_0\mu_0 + a_1\mu_1$ if $x_0$ was sent,
$= b_0\mu_0 + b_1\mu_1$ if $x_1$ was sent.

The hypothesis test remaining is to choose which of the
mean values is more likely.

In the problem considered above where

$$s(t) = x_i(t) + a\,x_i(t - \tau_1) + n(t)$$

we have $dF(0) = 1$, $dF(\tau_1) = a$. Then by equating the
estimated moments to the true moments we have

$$\mu_0 = \int dF(\tau) = 1 + a, \qquad \mu_1 = \int \tau\,dF(\tau) = a\tau_1,$$

whence $\mu_0 - 1$ is an unbiased estimate for $a$. Then using
this estimate for $a$ in the expression for $s(t)$ we have left
a hypothesis testing problem which approximates the
original one but is simpler; it is in fact precisely the
problem discussed in Section I.
APPENDIX I

Suppose $n(t)$ is filtered white noise; i.e.,

$$n(t) = \int_{-\infty}^{\infty} C(u)\,d\xi(t + u),$$

where $C(u)$ is the filter impulse response which vanishes
for $u > 0$ and which we assume is integrable and square
integrable, $\xi$ is a stochastic process with orthogonal
increments and the integral involved is a stochastic
integral [2, p. 539]. Then

$$\rho(t - s) = E(n(t)n(s)) = \int_{-\infty}^{\infty} C(u)\,C(u + t - s)\,du.$$

Theorem: The eigenfunctions of

$$\int_0^T \rho(s - t)\,\varphi(t)\,dt = \frac{1}{\lambda}\,\varphi(s)$$

span the whole $L_2$ space.

Proof: Let

$$Rf(s) = \int_0^T \rho(s - t)\,f(t)\,dt$$

and suppose $Rf = 0$ for some $f$ so that $(Rf, f) = 0$, then

$$0 = \int_0^T ds\,f(s)\int_0^T dt\,\rho(s - t)\,f(t)$$

$$= \int_0^T ds\int_0^T dt\int_{-\infty}^{\infty} du\,C(u - s)\,C(u - t)\,f(s)\,f(t)$$

$$= \int_{-\infty}^{\infty} du\left[\int_0^T C(u - s)\,f(s)\,ds\right]^2$$

$$= \int_{-\infty}^{\infty} \left|S\tilde{f}(u)\right|^2 du$$

where

$$S\tilde{f}(u) = \int_{-\infty}^{\infty} C(u - s)\,\tilde{f}(s)\,ds$$

and $\tilde{f}$ is $f$ on $(0, T)$ and 0 outside.
But the Fourier transform of $S\tilde{f}$ is the product of the
Fourier transforms of $C$ and $\tilde{f}$ and the transform of $C$
cannot vanish on any set of positive measure by a theorem
of Paley and Wiener [3, p. 739] so the transform of $\tilde{f}$
vanishes everywhere; i.e., $f = 0$. Now suppose $g$ is
orthogonal to all the eigenfunctions and hence to all
functions of the form $Rh$. Then $(Rg, h) = (g, Rh) = 0$ for
all $h$ so $Rg = 0$ so $g = 0$.
APPENDIX II

In order to investigate the consistency of the test for the
moments of $dF$ we suppose that for all $\tau$ in the interval
where $dF$ has positive measure and for $\tau = 0$ we have

$$\rho_0(\tau) = \lim_{T \to \infty}\frac{1}{T}\int_0^T x_0(t + \tau)\,x_0(t)\,dt = \lim_{T \to \infty}\frac{1}{T}\int_0^T x_1(t + \tau)\,x_1(t)\,dt$$

and

$$\rho_1(\tau) = \lim_{T \to \infty}\frac{1}{T}\int_0^T x_0(t + \tau)\,x_1(t)\,dt = \lim_{T \to \infty}\frac{1}{T}\int_0^T x_1(t + \tau)\,x_0(t)\,dt,$$

and for each $T$ we choose polynomials

$$\rho_0^T(\tau) = \sum_{j=0}^{N} a_j(T)\,\tau^j$$

and

$$\rho_1^T(\tau) = \sum_{j=0}^{N} b_j(T)\,\tau^j$$

in such a way that $a_j(T) \to a_j$ and $b_j(T) \to b_j$. We choose
$\tau_0, \cdots, \tau_N$ so that the equations

$$\lim_{T \to \infty}\frac{1}{T}\int_0^T s(t)\left[x_0(t + \tau_j) + x_1(t + \tau_j)\right]dt = \sum_{k=0}^{N}\alpha_{jk}\int\tau^k\,dF(\tau)$$

will have a solution; that is, the matrix $A = (\alpha_{jk})$ will have
an inverse. Let $s(T)$ be the vector whose $j$th component is

$$\frac{1}{T}\int_0^T s(t)\left[x_0(t + \tau_j) + x_1(t + \tau_j)\right]dt$$

and let $A(T)$ be the matrix made up in the same way that
$A$ is using $a_j(T)$ and $b_j(T)$ instead of $a_j$ and $b_j$. Then for
large enough $T$, $A(T)^{-1}$ exists since $A(T) \to A$ and we
set $\mu(T) = A(T)^{-1}s(T)$.

Theorem: If $r(\infty) = 0$ then the estimates

$$m_0^*(T) = \sum_{i=0}^{N} a_i(T)\,\mu_i(T) \quad \text{and} \quad m_1^*(T) = \sum_{i=0}^{N} b_i(T)\,\mu_i(T)$$

for the means of $(s, x_0)$ and $(s, x_1)$, under the assumption
that the autocorrelations over the interval 0 to $T$ are
given by $\rho_0$ and $\rho_1$, are consistent.

Proof: Let n(T) be the vector whose jth component is 


val nb oot fon ae ee ea 
and let @(7) = A(T)~* (s(T) — n(T)), then 


var (m3(T) — m(T)) 


var bs a(T) p(T) — / po(7) ar(a)) 


I 


var is a(T)(u(T) — a(T)) 
4 De a.(t)(a.(2) = / 7 aP(2)) 


+f (Date = ete arco |. 


For large enough 7 
var (>, a(T)(u(T) — a.(P))) 


; 1 
< 2 m,ax; | a; |? N” m,ax; var (af nt) [ao(t + 7;) 
0 


+ a(t + 1)] dt) 
so it will be sufficient for this term to show that

$$\frac{1}{T^2}\int_0^T\!\!\int_0^T r(s - t)\,x_a(t + \tau)\,x_b(s + \tau)\,ds\,dt \to 0$$

for any $a$, $b$, and $\tau$. By Schwarz's inequality the square
of this is dominated by

$$\left(\frac{1}{T^2}\int_0^T\!\!\int_0^T r^2(s - t)\,ds\,dt\right)\left(\frac{1}{T}\int_0^T x_a^2(t + \tau)\,dt\right)\left(\frac{1}{T}\int_0^T x_b^2(s + \tau)\,ds\right) = O\left(\frac{1}{T^2}\int_0^T\!\!\int_0^T r^2(s - t)\,ds\,dt\right)$$

since the last two terms converge to $\rho_0(0)$. Finally

$$\frac{1}{T^2}\int_0^T\!\!\int_0^T r^2(s - t)\,ds\,dt = \int_0^1\!\!\int_0^1 r^2(T(u - v))\,du\,dv$$

which goes to zero by the dominated convergence theorem.
Since $A(T)^{-1} \to A^{-1}$,


p(T) > fr dF) 


so the second term goes to zero, and the third term goes
to zero by the dominated convergence theorem, which
completes the proof.

Suppose that on each interval $MT$ to $MT + T$ one of
the signals $x_0$ or $x_1$ is sent and we are required to decide
at each $MT$ which has been sent in the previous interval.
If $\rho$ and $r$ are as above and for each $M$ we choose $a_i(MT)$
and $b_i(MT)$ to make the polynomials $\sum a_i(MT)\tau^i$ and
$\sum b_i(MT)\tau^i$ fit $\rho_0$ and $\rho_1$ at chosen points $\sigma_0, \cdots, \sigma_N$
so that $a_i(MT) \to a_i$ and $b_i(MT) \to b_i$ then the estimates

$$m_0^* = \sum a_i(MT)\,\mu_i(MT)$$
$$m_1^* = \sum b_i(MT)\,\mu_i(MT)$$

are easily seen to be consistent.


Correction


F. L. Stumper's paper, "A Bibliography of Information Theory (Communication
Theory—Cybernetics)," which appeared in TRANSACTIONS OF THE IRE, Vol. IT-1, No. 2,
Tuckerman, L. S.
Unger, S. H. 
Vaughan, J. A. 
Wallace, Marcel 
Walston, C. E. 


Weber, Ernst, Prof. 


Weber, L. A. 
Whitenack, R. M. 
Whittle, R. L. 
Wilson, I. G. 
Winson, Philip 


Nordyke, H. W., Jr. 
Wise, R. J.

Wolensky, William 
Wolotkin, Aaron 
Wynne, W. M., Comdr. 
Zadeh, L. A. 

Zaslavsky, Sam 
Zemanian, A. H. 


Northern New Jersey 


Aaron, M. R. 
Anderson, G. M. 
Bagno, S. M. 
Bailey, R. 8. 
Baldwin, G. L. 
Bennett, W. R. 
Black, H. S. 
Brogan, J. M. 
Bucher, F. X. 
Burford, T. M. 
Busignies, H. 
Cahill, W. J. 
Caretta, P. M. 
Clavier, A. G. 
Compton, H. B. 
Cotellessa, R. F. 
Crofts, G. B. 
David, E. E., Jr. 
Davis, R. C. 
Dayem, A. H. 
De Graaf, D. A. 
De Rosa, L. A. 
De Vany, B. V. 
Dickey, D. W. 
Edison, T. M. 
Felch, E. P. 
Fischer, L. G. 
Goeller, L. F., Jr. 
Gordon, M. J. 
Graham, R. E. 
Grandmont, P. E. 
Groce, J. C. 
Gross, F. J. 
Grossman, A. J. 
Grower, B. B. 
Guenther, Richard 
Hartley, R. V. L. 
Hays, F. R. 
Heiser, W. H. 
Hibbert, J. J. 
Hoberman, M. J. 
Hulst, G. D. 
Jensen, A. G. 
Kamentsky, L. A. 
Kang, Chi Lung 
Kreer, J. G., Jr. 
Kretzmer, E. R. 
Kups, E. F. 
Lewis, R. F. 
Lundburg, F. J. 
Lupo, F. J. 
Martin, J. F. 
Martin, 8. J. 
Mathews, M. V. 
McMillan, Brockway 
Meacham, L. A. 
Meier, W. L. 
Mock, W. C., Jr. 
Montcalm, 8. R. 
Moore, E. F. 
Murdock, B. B. 
Nestor, George 
Perry, ASD Jn 
Rea, W. T. 
Rizzo, J. E. 
Rogoff, Mortimer 
Ronald, T. T. 
Rubino, J. W 
Saloom, J. A. 
Sauber, J. W. 
Schott, L. O.
Schwartz, FE. I. 
Seckler, H. N. 
Shangraw, C. C. 
Shannon, C. E. 
Sherman, Seymour 


Siekanowicz, W. W. 


Slepian, David 
Sudduth, W. B. 
Sullivan, H. J. 
Sweeney, W. R. 
Terry, C. B. 
Thompson, T. H. 
Treuhaft, M. A. 
Turkheimer, P. M. 
Van Tassel, HB. K. 
West, 8. 8. 
Westbrook, KE. P. 
Wickliffe, P. R., Jr. 
Wing, Omar 
Wirth, C. H. 
Zayac, F. R. 


Princeton 


Bellew, F. N. 
Cadden, W. J. 
Ceres, G. V. 
Clement, P. R. 
Fich, Sylvan 
Gilchrist, Bruce 
Lindsay, F. A. 
Mather, N. W. 
Shomer, J. EK. 
Strother, J. A. 
Surber, W. H., Jr. 
Sziklai, G. C. 
Wong, S. Y. 


Region III 


Atlanta 


Corriher, H. A., Jr. 
Dasher, B. J. 
Emmert, R. A. 
Finn, D. L. 

Flynt, E. R. 


Mauldin, H. W., Jr. 


Roberts, T. E., Jr. 
Schultz, F. V. 
Walker, E. M. 
Widerquist, V. R. 


Baltimore 


Basil, 1st: 
Blasbalg, Herman 
Broady, 8. N. 
Browdin, M. E. 
Cichanowicz, H. J. 
Csepely, J. A. 
Hsterson, G. L. 
Gilbert, G. B., Jr. 
Glaser, E. M. 
Goetz, L. P. 
Gott, Euyen 
Herwald, 8. W. 
Huggins, W. H. 
Isley, C. T., Jr. 
Jenkins, Jobe 
Jones, L. G. F. 


Kovasznay, L. 8. G. 


Mester, J. C. 
Mott, O. B. 
Pierson, C. D., Jr. 
Pipkin, J. E. 
Pullen, K. A., Jr. 
Raboy, Bernard 
Rambo, S. I. 
Rosenblum, I. I.
Schultz, R. L. 
Terashi, Taro 
Thomas, J. F. 
Wedge, T. E. 
Worrell, Ie. A. 


Central Florida 
Anderson, G. F. 
Downs, J. W. 
Friedland, M. S. 
Hines, W. 8. 
Koning, R. I. 
Lahn, F. C. 
Lloyd, L. H. 
Neidert, J. H. 
Painter, Parker, Jr. 
Rogers, C. L., Jr. 


Huntsville 


Clements, C. C. 
Greenwood, T. L. 
Schumann, Fred 


Miami 

Kush, L. J. 

Wise, D. O., Jr. 
North Carolina- 
Virginia 

George, R. T., Jr. 
Gremer, C. E., Lt. 
Harris, O. R. 
Hastings, C. I. 
Passera, A. L. 
Pitsenberger, J. W. 
Philadelphia 


Annett, M. E. 
Bachofer, H. L. 
Balakrishnan, A. V. 


Bargellini, I. P. L., Dr. 


Beaumariage, D. C. 


Benner, R. H., Dr., II 


Bensky, L. 8. 
Berkowitz, R.S. 
Bose, A. G. 
Bradford, C. E. 
Buchanan, J. P. 
Bucher, T. T. N. 
Byers, A. C. 
Cecala, J. A. 
Charton, Seymour 
Cohen, Irving 
Cramer, B. G. 
Deutsch, Joseph 
Drenick, Rudolf 
Khrich, W. G. 
Kichert, E. S. 
Emslie, N. M. 
Foley, G. M. 
Friend, A. W., Dr. 
Gans, J. 8. 
Gardner, R. K. 
Garrett, LL. B., Jr: 
Gaynor, Emil 
Gollub, Raphael 
Gottschalk, J. M. 
Greene, A. H. 


Greenfield, Alexander 


Halev, Irvin 
Hobbs, L. C. 


Holshouser, J. R., Jr. 


Honda, Hajime 
Hopper, G. M., Dr. 
Hurford, W. L. 
Kassler, Raymond 
Katz, Abraham 


Katz, E. S. 
Kerfoot, B. P., Jr. 
Ku, Yu H. 
Lazinski, R. H. 
Linden, D. A. 
Markarian, Hagop 
Maron, Irving 
Mathes, R. FE. 
Meredith, B. E. 
Merrihew, H. W. 
Mertens, L. KE. 
Million, J. W., Jr. 
Morton, W. B., Jr. 
Mrazek, D. A. 
Noel, R. Y. M. 
Patterson, G. W. 
Price, Robert 
Ringer, H. N. 
Roberts, F. R. 
Roberts, M. J. 
Rogers, R. F. 
Roseman, H. M. 
Rosen, George 
Rosenzweig, G. I. 
Ryan, Vie Parr 
Sachs, H. L. 
Schroeder, R. P. 
Schulz, R. B. 
Showers, R. M. 
Sorkin, C. S. 
Steinberg, B. D. 
Tou, Julius, Dr. 
Urkowitz, Harry 
Weisbecker, J. A. 
Weiss, Eric 
Williams, A. J., Jr. 
Wolin, Samuel 
Woll, H. J. 
Yamada, Hisao 
Yang, Tsute 


Washington, D. C. 


Alde, R. O. 
Alderson, W. S. 
Ballard, A. H. 
Bates, J. F. 
Bauer, P. S., Cdr. 
Baxter, J. L. 
Beck, H. M. 
Blackburn, C. A. 
Blair, C. R. 
Calhoon, T. G. 
Campanella, S. J. 
Cook, F. W. 
Cooper, H. W. 
Cowan, Bryan, Lt. Col. 
Donnell, W. F. 
Elbourn, R. D. 
Fine, Harry 
Finn, P. L. 
Finney, W. J. 
Fleming, J. J. 
Friedberg, I. S. 
George, S. F. 
Gerig, J. S. 
Geselowitz, D. B. 
Gleason, R. F. 
Godsey, W. J. 
Goldberg, Harold 
Grimstad, E. J. 
Grisamore, N. T. 
Hafer, F. L. 
Hawthorne, G. B., Jr. 
Haydon, G. W. 
Headrick, J. M. 
Hedge, L. B. 
Hepler, D. S. 
Heyliger, G. E. 
Hogan, D. L. 


Holman, J. G. 
Holmes, C. H. 
Horton, B. M. 
James, W. G. 
Karaoke 
Katzin, Martin 
King, A. M. 
King, W. P. 
Kirsch, R. A. 
Kirshner, J. M. 
Klein, M. H. 
Kohler, H. W. 
Kriz, Joseph 
Kuck, J. H. 
Kullback, Solomon 
La Pointe, J. C. 
Larson, R. C. 
Leiner, A. L. 
Levy, J. E. 
Lieberman, Gilbert 
Lord, J. B. 
Lun, M. J. 


McCracken, L. G., Jr. 


McGinnis, C. 1. 
McClurg, G. H. 
Melton, B. S. 
Moore, C. G., Jr. 
Neumann, A. J. 
Ould, R. S. 
Page, C. H., Dr. 
Page, R. M. 
RarrbeD: 
Peterson, H. L. 
Phillips, M. L., Mrs. 
Polak, Henri 
Potter, R. W. 
Powell, R. M. 
Reed, S. F. 
Rotkin, Israel 
Ryan, A. H. 
Salerno, James 
Schwartz, R. J. 
Scott, R. M. 
Scott, S. R. 
Scott, W. H., Jr. 
Sears, J. F. 
Sherertz, P. C. 
Smith, B. D., Jr. 
Smith, E. L., Jr. 
Spriggs, J. O. 
Stastny, G. F. 
Stockebrand, T. C. 
Sugar, G. R. 
Summers, C. R. 
Talkin, A. I. 
Tieman, C. R. 


Thompson, R. T., Jr. 


Wald, Bruce 
Waldschmitt, J. A. 
Weiland, F. A. 
Wilcox, R. H. 
Willard, J. M. 
Wright, W. W. 
Youden, W. W. 
Young, J. W., Jr. 
Zakowski, L. F. 
Zirm, R. R. 


Region IV 


Akron 


Bushnell, R. H. 
Diamantides, N. D. 
Flowers, H. L. 
Gaul, E. R. 

Kult, M. L. 
Pressel, P. I. 
Richman, M. A. 


Cincinnati 


Beyer, J. P. 
Doerr, W. H. 
Edwards, R. L., Jr. 


Grosch, H. R. J., Dr. 


Willsey, R. H. 


Cleveland 


Baerwald, H. G. 
Ehrman, J. R. 
Hare, V. C. M., Jr. 
labia. de dle 

King, C. F. 

Kres, A. J. 

Laden, H. N., Lcdr. 
Phillips, W. E., Jr. 
Pulaski, M. E. 
Wickenden, H. R. 


Columbus 


Chope, H. R. 
Clifton, J. K. 
Dawirs, H. N. 
Warren, C. E. 
Weimer, F. C. 


Detroit 


Batten, H. W. 
Bernhard, H. A. 
Book, Everard 
(Care Wa \Woy MM 
Cutrona, L. J. 
Deutsch, Menachem 
Farris, H. W. 
Garner, H. L. 
Gilbert, E. O. 
Gilbert, E. G. 
Goodrich, R. H. 
Gottesman, N. H. 
Hok, Gunnar 
Jacobs, M. A. 
Lindsay, W. J. 
Longerich, E. P. 
Macnee, A. B. 
McPherson, R. R. 
Morin, D. C., Jr. 
Nakagawa, Noriyuki 
Otterman, Joseph 
Piper, C. A. 
Rauch, L. L. 
Reiher, H. F. 
Rupcich, J. N. 
Schoderbek, J. J. 
Scott, N. R. 
Sharpe, C. B. 
Stewart, J. L. 
Szajna, E. F. 

Aditi; dit. (Ce 


Emporium 


Baker, W. L. 
Byers, H. K. 
Golla, E. F. 
Harvey, H. B. 
Higdon, R. V. 
Kevyn @y Eee dr 


Knausenberger, G. E. 


Lawther, J. M. 
Lopez, A. F. 
Myers, S. J. 


Pittsburgh 


Adams, C. W. 
Alexander, F. C., Jr. 


ight, R. L. 

my, DT. F. 

an, W. C. 

ra, D. J. 

nnon, G. F., Jr. 
ace, J. N. 

1et, F. R., Mrs. 
itchison, T. C. 
landbeck, R. F. 
otzbaugh, G. A. 
arlowe, E. W. 
arshall, B. O., Jr. 
Donnell, T. J. 
hatz, E. R. 

in Nice, R. I. 
oodford, J. B., Jr. 


Toledo 

ijii, Takashi 
Williamsport 
ebb, H. E. 


Region V 
Cedar Rapids 


vbillus, John 
wenberg, E. C. 
over, H. A. 
ilson, 1D), Vek 


Chicago 

yerer, J. C. 
iderson, J. M. 
all, RK. B. 

pram, R. E. 
pany, RR. F. 
elawa, F. R. 
prrowman, J. H. 
rauer, ple Jal. 
irlson, G. R. 

\ Robert 
lavier, P. A. 
phn, G. I. 

phn, Jona 
poney, J. J. 
ruz, W. S. 
restone, W. L. 
alejs, Janis 
prlach, A. A. 
nel, IDs AJ 
slovich, S. J. 
upert, J. J. 
ryis, K. W. 
nes, R. W. 
arow, K. A. 
em, R. F. 
rson, R. W. 
wis, H. A. 
ndholm, CAR: 
mod. H., Jr: 
ic. Henry 
ansfield, Ralph 
essinger, H. P. 
oon, R. J. 
orrison, Peter 
ison, W. R. 
ker, Tarik 
wadise, M. E. 
yey ELC: 
ichards, H. F. 
uina, Jack 
Itzberg, Bernard 
wahan, B. L. 


lapin, Theodore, Jr. 


ma, Raymond 


Sullivan, J. M., Jr. 


Tyksinski, S. P. 
Warnke, G. F. 
Webb, H. D. 
Weissman, R. M. 
Yuen, P. C. 


Dayton 


Barker, W. J. 
Bordewisch, J. F. 
Brown, G. T., Jr. 
Goldman, C. C. 
Gould, G. P. 
Haneman, V. S., Jr. 
Kott, W. O. 
Lockwood, G. C. 
McLennan, M. A. 
Piety, E. W. 
Thompson, C. W. N. 
Waugh, R. E. 


Des Moines-Ames 
Leverington, R. D. 


Fort Wayne 


Chambers, W. A. 
Johnson, D. L. 
Weidner, E. J. 


Indianapolis 


Clarkenoirvs 
Cooper, G. R. 
Fenimore, G. E. 
Goldstein, S. J., Jr. 
McCune, D. A. 
Ritt. Bs: 


Milwaukee 


Asmuth, J. L. 
Halijak, C. A. 
Herzog, Will 

Rideout, V. C. 
Scheibe, E. H. 
Theiss, C. M. 


Omaha-Lincoln 


Chamberlain, I. J. 
Wycoff, K. H. 


Twin Cities 


Bashara, N. M. 
Bratschi, R. W. 
Cohen, A. A. 
Essler, W. O. 
Featherstone, R. P. 
Gergen, J. L. 
Hardenbergh, G. A. 
Harris, L. A. 
Kelsey, J. R. 
Koschmann, A. H. 
Lode, Tenny 
Ludwig, J. T. 
Markusen, D. L. 
Morell, C. S. 
Rowland, C. A., Jr. 
Schmitt, O. H. 
Youngquist, R. J. 


Region VI 
Dallas-Fort Worth 


Brachman, M. K. 
Brust, M. F. 
Dodd, J. M. 


Dolan, B. A., Lt. 
Fletcher, C. H. 
Gudzin, M. G. 
Harmon, F. I. 
Jones, H. J. 
Leming, T. L. 
Mitchell, W. R. 
Muerle, J. L. 
Mut, S. C. 
Wadel, L. B. 
Weedfall, W. W. 
Wright, T. A. 


Denver 


Carlin, P. W. 
Cottony, H. V. 
Daniels, W. H. 
Fulton, F. F., Jr. 
Slutz, R. J. 


El Paso 


Carbine, I. L. 
Kidwell, R. P. 
Kutrumanes, C. P. 
Lovitt, S. A. 


Mickleburgh, W. C., Pvt. 


Miller, W. E., Jr. 
Muehlner, J. W. 
Pyle, C. A. 
Sherburne, R. K. 
Smith, K. E. 
Thompson, D. I. 
Wagner, Nathan 
Yiarter bakes 


Houston 


Balle Ja: 
Chapman, D. C. 
Easterling, M. F. 
Harbaugh, A. W. 
Harker, H. J. 
Keating, L. M. 
Melloh, A. W. 
Pierson, A. L., III 
Rust, W. M., Jr. 
Tanguy, D. R. 
Weber, M. E. 
Wischmeyer, C. R. 


Kansas City 


Miller, H. G. 
Pulliam, C. S. 


New Orleans 


Dowe, R. J. 
Gordon, C. K. 
McLean, L. V. 


Oklahoma City 


Challenner, A. P. 
Daniel, D. B. 
Puckett, T. H. 


St. Louis 


Fordyce, S. W. 
Furfine, A. L. 
Hirsch, O. C. 
Keiser, B. E. 
Little, G. R. 
Mohrman, R. F. 
Palmer, J. A. 
Roberts, E. L. 
Tedeschi, Anthony 
Van Bladel, J. G. 
Winter, D. F. 


San Antonio 


Douglas, J. H. 
Hoffman, A. A. J. 
Levin, M. J., Pvt. 
Ziemer, D. R. 


Tulsa 


Day, C. E. 
Kammerzell, C. E. 
Moxley, S. D., Jr. 
Piety, R. G. 

Rice, R. B. 

Scott, C. B. 
Silverman, Daniel 


Region VII 


Albuquerque-Los Alamos 


Banksia Gar Ji 
Basore, B. L. 
Bidwell, H. H. 
Brown, W. E., Jr. 
Fursa, Alex 
Moore, R. K. 
Skinner, L. V. 
Williams, C. S., Jr. 


Inyokern 
Zilmer, D. E. 


Los Angeles 


Abramson, N. M. 
Ackerlind, Erik 
Adrian, D. J. 
Albrecht, Albert 
Allen, D. H. 
Andrews, L. A. 
Antell, Stanley 
Apa, F. E. 
Asawa, C. K. 
Ashby, R. M. 
Ausbourne, R. K. 
Avrin, J. S. 
Babcock, D. F. 
Barnes, J. L. 
Beck, Leonard 
Bedrosian, Edward 
Bell, N. W. 
Begovich, N. A., Dr. 
BindsgtvaeNe 
Bower, J. L. 
Boyd, W. L. 
Brady, F. H. 
Braun, E. L. 
Brennan, L. E. 
Briggs, J. G. 
Brown, C. S. 
Burkart, E. H. 
Cain, G. H., Jr. 
Campbell, R. A. 
Carlson, C. O. 
Chu, Henry 
Colander, R. E. 
Culver, W. H. 
Cummings, C. I. 
Curl, G. W. 
Davis, F. W. 
Davis, Harold 
Davis, J. S. 

De Lano, R. H. 
Dethlefsen, D. G. 
Deutsch, Ralph 
Diemer, F. P. 
Downes, Lloyd 
Drake, W. A. 


Duncan, D. B. 
Du Waldt, B. J. 
Edelsohn, C. R. 
Endsley, G. T. 
Epstein, R. A. 
Escher, P. H. 
Fatton, G. A. 
Fisher, R. L. 
Fishman, Max 
Foxman, Eugene 
Francis, J. P. 
Frankel, S. P. 
Fuller, R. H. 
Garber, L. F., Lt. 


Gardner, L. B., Dr. 


Gates, C. R. 
Gates, H. P., Jr. 
Gerardi, F. R. 
Gilchriest, C. E. 
Gola, A. S. 
Green, D. J. 


Greenbaum, Marvin 


Gross, William 
Hadden, F. A. 
Hance, H. V., Dr. 
Hannum, A. J. 
Hare, G. H. 
Hayes, W. T. 
Heilfron, Jack 
Hodson, W. G. 
Holly, C. M. 
Howard, S. L. 
Inouye, G. T. 
Jacobs, J. E. 


Jacobson, R. E., Jr. 


Joerger, J. C. 
Kelly, D. H. 
Klein, W. J. J. 
Klestadt, Bernard 
Knopoff, Leon 
Gall (CL US. 
Lader, L. J. 
Lambert, J. M. 
Lambert, J. D. 
Landman, R. M. 
Larse, G. L. 
Lehan, F. W. 
Lephakis, A. J. 
Levinson, R. M. 
Lew, Clinton 
Louie, William 
Low, Henry 
Lyons, L. H. 
Macintyre, R. M. 
Magnuson, V. P. 
Maki, G. J. 
Mathews, W. E. 
McCormick, G. F. 
McFarlane, M. D. 
McNabb, J. W. 
McRuer, D. T. 
McVey, B. D. 
Molloy, C. T. 
Moore, J. R. 
Moreno, C. A. 
Muchmore, R. B. 
Munushian, Jack 
Noland, A. R. 
Nunn, W. M., Jr. 
Nussbaum, O. N. 
Parker, N. F. 
Pfeffer, Irwin 
Pierson, J. E. 
Politi Have 
Proud, W. H. 
Rawlins, R. E. 


Rechtin, Eberhardt 
Reedy, P. H. 
Reilly, Michael 
Roberson, R. E. 
Rosenstein, A. B. 
Rumer, W. I. 
Rutkin, B. B. 
Rypinski, C. A., Jr. 
Salzer, J. M. 
Samuelson, H. R. 
Schalk, Norbert 
Schindler, Mark 
Schreiber, W. F. 
Seltzer, L. J. 
Sensiper, Samuel 
Silva, L. M. 
Slogar, J. W. 
Snyder, W. A. 
Starr, A. R. 
Stimpson, L. D., Jr. 
Stoehr, W. F. 
Stoltz, J. R. 
Stoltz, P. G. 
Taber, J. H. 
Tatum, F. A. 
Thomsen, R. K. 
Toeppe, W. J., Jr. 
Trautman, D. L. 
Van Horne, T. B. 
Votava, Yaro 
Waddell, B. L. 
Wagner, D. W. 
Walp, R. M. 
Walquist, R. L. 
Wanlass, S. D. 
Ware, W. H. 
Warner, S. D. 
Wedel, J. J., Jr. 
Weidman, J. S. 
West, G. P. 
Westlake, P. R. 
Whiteley, T. B. 
Whitford, R. K. 
Wiggins, E. T. 
Williams, R. D. 
Wohl, Jack 
Wood, B. C. 
Wright, P. B. 
Young, C. W. 
Young, G. O. 
Zumba, C. F. 


Phoenix 


Albright, A. R. G. 
Bard, W. E. 
Brooks, H. B. 
Hammond, J. R. 
Morgan, H. L. 
Morrison, Fred 
Noon, J. R. 
Perper, Lloyd 
Ross, J. M. 
Thomas, S. M. 
Winkler, Stanley 


Portland 
Donoghue, J. J. 
Goldberg, P. A. 
Ropiequet, R. L. 
Strain, D. C. 

Salt Lake City 
Davidson, R. A. 
Marsden, R. S., Jr. 
San Diego 


Austin, R. W. 
Caspers, J. W. 